Ignoring Repeated Sets of Logging
There is a program that logs a lot of information, some of which tends to repeat itself in certain situations. I've been tasked with preventing this over-logging and need some direction on what to do. Currently I can prevent the same message from repeating consecutively, but it becomes a lot trickier to prevent repeats of variable-sized sets of unique log messages.
To break the problem down into a smaller one, I am using single characters to represent unique log messages.
Input
aaaaababababacdefgfggfabcddggddgg
Output
abacdefgfabcdg
0 : Hash: 100, Msg: d
1 : Hash: 103, Msg: g
This is the output of my current program, which seems to work for sets of one or two unique characters. The characters displayed in a row are the messages that actually get logged, and the next two lines show the contents of the buffer I currently use to compare sequences. Therefore, if I add "xz" to the original input, I get the following output:
abacdefgfabcdgdg
0 : Hash: 120, Msg: x
1 : Hash: 122, Msg: z
As we can see, the "d" and "g" get "logged", and the next sequence we wouldn't allow to occur would be "xz" or "z", based on what's in the sequence buffer.
Does anyone know of an actual algorithm that can detect/prevent repeated unique sequences?
I've looked at this but it doesn't quite fit my needs.
More Examples
Input -> Desired Output
- ABCABCACD -> ABCACD
- AAABABACBCBCB -> ABACB
- ABBABBA -> ABABA
- ACDACDDDCDC -> ACDC
The letters represent unique log messages and I'd like to prevent the repeated sets of log messages from showing up.
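The behaviour most of the examples describe can be phrased as an online filter that drops immediate tandem repeats: whenever the next block of messages would exactly duplicate the block just emitted, suppress it. Below is a minimal sketch of that idea; the class and function names are mine, not from any existing code. Note the desired outputs above don't all fit one rule: this sketch reproduces examples 1, 2 and 4, but collapses ABBABBA to ABA rather than ABABA.

```cpp
#include <string>

// Sketch of an online tandem-repeat suppressor. Each char stands for one
// unique log message. Names (RepeatFilter, push, finish) are illustrative.
class RepeatFilter {
public:
    // Feed one message.
    void push(char c) {
        for (;;) {
            std::string s = pending_ + c;
            // Complete repeat: s equals the last |s| emitted messages,
            // i.e. the block would appear twice in a row -> drop it.
            if (s.size() <= out_.size() &&
                out_.compare(out_.size() - s.size(), s.size(), s) == 0) {
                pending_.clear();
                return;
            }
            // Partial repeat: s matches the start of some emitted suffix of
            // length p > |s| -> hold it back and wait for more input.
            for (std::size_t p = s.size() + 1; p <= out_.size(); ++p) {
                if (out_.compare(out_.size() - p, s.size(), s) == 0) {
                    pending_ = s;
                    return;
                }
            }
            // No repeat in progress: flush anything held back, then retry
            // c against the extended output (it may start a new repeat).
            if (pending_.empty()) { out_ += c; return; }
            out_ += pending_;
            pending_.clear();
        }
    }
    // Emit whatever is still held back (call at end of stream).
    std::string finish() { out_ += pending_; pending_.clear(); return out_; }
private:
    std::string out_, pending_;
};

static std::string filter(const std::string& in) {
    RepeatFilter f;
    for (char c : in) f.push(c);
    return f.finish();
}
```

The scan over candidate periods makes each push O(n^2) in the worst case, which is fine for a small rolling buffer; a production version would bound the history length.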
c++ algorithm logging
What is the size of the problem? I.e. give us a rough estimate of how many log messages/sec. Do you want the messages to be unique for the entire log? That could be hard with a large log. One way to restrict problem size would be to process the log in blocks, and only guarantee that a message is unique within its own block.
– Alex
Nov 14 '18 at 20:49
This sounds bad. There is already a lot of overhead in logging; why would you want to make it worse by tracking logged messages in the process that logs them? Why would you only want to see that first message? It looks like what you really need is a mainstream logger that provides filtering, and/or to reevaluate how and what you are logging. If this is for some real-world production project, I'd surely raise an eyebrow. Not enough context is provided to really suggest anything.
– Christopher Pisz
Nov 14 '18 at 21:01
Look up compression algorithms. You are compressing the output (logging) stream.
– Thomas Matthews
Nov 14 '18 at 21:16
Do you need to log date or time stamps?
– Thomas Matthews
Nov 14 '18 at 21:18
The problem I would like to solve is that the application, when in a broken state, can fill the logs with repeated sequences of messages, so troubleshooting becomes impossible because the useful log messages have since been overwritten. @ChristopherPisz
– RAZ_Muh_Taz
Nov 14 '18 at 21:54
asked Nov 14 '18 at 20:27 by RAZ_Muh_Taz
edited Nov 14 '18 at 22:39