How to find utf8_string in another utf8_string using tinyutf8 in C++11?

I am using the tinyutf8 C++ UTF-8 string library from
https://github.com/DuffsDevice/tinyutf8

I'm trying to call utf8_string::find_first_of, passing a utf8_string as the first parameter.

This generates the following error:

error: no matching function for call to ‘utf8_string::find_first_of(utf8_string&, int&)’
int found_pos = haystack.find_first_of(needle, at_pos);
^
In file included from Phonemizer.cpp:8:0:
tinyutf8.h:1728:12: note: candidate: utf8_string::size_type utf8_string::find_first_of(const value_type*, utf8_string::size_type) const
size_type find_first_of( const value_type* str , size_type start_codepoint = 0 ) const ;
^~~~~~~~~~~~~
tinyutf8.h:1728:12: note: no known conversion for argument 1 from ‘utf8_string’ to ‘const value_type* aka const char32_t*’

How can I get a char32_t* from my utf8_string?
Alternatively, what other mechanism is there to find a utf8_string within another utf8_string?

Thanks!
Shawn

asked Sep 15 '18 at 1:46

Shawn McMurdo

If you're not particular, you can just search for the byte sequence; the standard library has lots of find functions. If you are particular, you'll have to use a library to convert both the search string and the text being searched to a canonical Unicode form, to ensure that characters like "é" (for example) are represented by the same sequence of code points.

– Cheers and hth. - Alf
Sep 15 '18 at 3:07

Thanks @Alf for the helpful comment. I started down the path of doing the byte sequence search, getting a raw iterator and working back to a codepoint index, but then I realized I could use find, which accepts a utf8_string parameter, instead of find_first_of.

– Shawn McMurdo
Sep 19 '18 at 6:41
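
For anyone who does want the byte-sequence route mentioned above, here is a minimal sketch using only std::string, not tinyutf8. The helper names (codepoints_before, byte_offset_of, find_codepoint_index) are made up for illustration, and the code assumes both strings are well-formed UTF-8 and compares raw bytes without any Unicode normalization. It relies on the fact that every byte that is not a continuation byte (10xxxxxx) starts a code point.

```cpp
#include <cstddef>
#include <string>

// Count the code points that start within the first `byte_count` bytes.
// Continuation bytes look like 10xxxxxx; every other byte starts a code point.
std::size_t codepoints_before(const std::string& utf8, std::size_t byte_count)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < byte_count && i < utf8.size(); ++i)
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80)
            ++count;
    return count;
}

// Byte offset at which code point number `codepoint_index` starts,
// or utf8.size() if the string has fewer code points than that.
std::size_t byte_offset_of(const std::string& utf8, std::size_t codepoint_index)
{
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80) {
            if (codepoint_index == 0)
                return i;
            --codepoint_index;
        }
    }
    return utf8.size();
}

// Hypothetical helper: code point index of the first occurrence of `needle`
// at or after code point `start_codepoint`, or std::string::npos if absent.
std::size_t find_codepoint_index(const std::string& haystack,
                                 const std::string& needle,
                                 std::size_t start_codepoint = 0)
{
    const std::size_t start_byte = byte_offset_of(haystack, start_codepoint);
    const std::size_t hit = haystack.find(needle, start_byte);  // plain byte search
    if (hit == std::string::npos)
        return std::string::npos;
    return codepoints_before(haystack, hit);  // map the byte offset back to a code point index
}
```

Because UTF-8 is self-synchronizing, a byte match of a well-formed, non-empty needle always begins on a code point boundary, so counting the lead bytes before the hit gives the index directly.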

c++ string c++11 unicode utf-8


1 Answer

Shawn, currently tiny_utf8 does not support find_first_of with a utf8_string as the argument. However, to answer your second question: you can convert a utf8_string to a char32_t sequence using utf8_string::to_wide_literal( &char32_buffer ).
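
To illustrate what working on char32_t code points looks like, and why find rather than find_first_of is the right call for a substring search, here is a rough sketch that uses only the standard library and not tinyutf8 itself. decode_utf8, find_substring and find_any_of are hypothetical helpers written for this example; decode_utf8 assumes well-formed UTF-8 and does no validation. With std::u32string, find looks for the whole needle, whereas find_first_of treats its argument as a set of characters and returns the first position holding any one of them.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 into code points.
// No validation is done; malformed input yields unspecified code points.
std::u32string decode_utf8(const std::string& in)
{
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        // Number of continuation bytes implied by the lead byte.
        const std::size_t extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Payload bits of the lead byte: 7 for ASCII, else 5, 4 or 3.
        char32_t cp = (extra == 0) ? lead : (lead & (0x3F >> extra));
        ++i;
        for (std::size_t k = 0; k < extra && i < in.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Substring search: index of the first occurrence of the whole needle.
std::size_t find_substring(const std::u32string& haystack,
                           const std::u32string& needle,
                           std::size_t start = 0)
{
    return haystack.find(needle, start);
}

// Character-set search: index of the first code point that equals ANY
// code point in `set`; c_str() yields the const char32_t* form.
std::size_t find_any_of(const std::u32string& haystack,
                        const std::u32string& set,
                        std::size_t start = 0)
{
    return haystack.find_first_of(set.c_str(), start);
}
```

The two searches can give quite different positions for the same arguments, since find_any_of stops at the first code point that appears anywhere in the set.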

I hope this helps at least a little bit (even though you said you fixed the problem yourself already, which I am glad to hear :D).

All the best,
Jakob

answered Nov 14 '18 at 21:11

Jakob Riedle
