How to find utf8_string in another utf8_string using tinyutf8 in C++11?
I am using the tinyutf8 C++ UTF-8 string library from
https://github.com/DuffsDevice/tinyutf8
I'm trying to call utf8_string::find_first_of, passing a utf8_string as the first parameter.
This generates the following error:
error: no matching function for call to ‘utf8_string::find_first_of(utf8_string&, int&)’
int found_pos = haystack.find_first_of(needle, at_pos);
^
In file included from Phonemizer.cpp:8:0:
tinyutf8.h:1728:12: note: candidate: utf8_string::size_type utf8_string::find_first_of(const value_type*, utf8_string::size_type) const
size_type find_first_of( const value_type* str , size_type start_codepoint = 0 ) const ;
^~~~~~~~~~~~~
tinyutf8.h:1728:12: note: no known conversion for argument 1 from ‘utf8_string’ to ‘const value_type* {aka const char32_t*}’
How can I get a char32_t* from my utf8_string?
Alternatively, what other mechanism is there to find a utf8_string within another utf8_string?
Thanks!
Shawn
c++ string c++11 unicode utf-8
If you're not particular, you can just search for the byte sequence. The standard library has lots of find functions. If you are particular, you'll have to use a library to convert both the search string and the text being searched to a canonical Unicode form, to ensure that characters like "é" (for example) are represented as the same sequence of code points.
– Cheers and hth. - Alf
Sep 15 '18 at 3:07
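The byte-sequence search suggested in this comment can be sketched with the standard library alone. This is a minimal sketch, assuming both strings are well-formed UTF-8 held in std::string and that no normalization is needed; the helper names find_utf8 and codepoint_index are made up for illustration and are not part of tinyutf8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the code points contained in the first `byte_offset` bytes of
// well-formed UTF-8 text. Every byte that is not a continuation byte
// (bit pattern 10xxxxxx) starts a new code point.
std::size_t codepoint_index(const std::string& utf8, std::size_t byte_offset)
{
    assert(byte_offset <= utf8.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < byte_offset; ++i) {
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Search for `needle` in `haystack` by comparing raw bytes, starting at
// byte offset `start_byte`. Returns the code point index of the match,
// or std::string::npos if there is none.
std::size_t find_utf8(const std::string& haystack,
                      const std::string& needle,
                      std::size_t start_byte = 0)
{
    const std::size_t byte_pos = haystack.find(needle, start_byte);
    if (byte_pos == std::string::npos) {
        return std::string::npos;
    }
    return codepoint_index(haystack, byte_pos);
}
```

Comparing bytes is sufficient here because a well-formed UTF-8 needle can only match at a code point boundary of a well-formed haystack. It does not treat precomposed and decomposed forms of "é" as equal; that still requires normalizing both strings first.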
Thanks @Alf for the helpful comment. I started down the path of doing the byte sequence search, getting a raw iterator and working back to a codepoint index. Then I realized I could use find, which accepts a utf8_string parameter, instead of find_first_of.
– Shawn McMurdo
Sep 19 '18 at 6:41
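The difference between the two functions is easiest to see with std::string, which uses the same names. The sketch below only uses the standard library; it assumes that tinyutf8 follows the same semantics for find and find_first_of, which the candidate signature in the error message suggests but which is not verified here. The wrapper names substring_pos and any_char_pos are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Position of the first occurrence of the whole substring `needle`,
// or std::string::npos if it does not occur.
std::size_t substring_pos(const std::string& haystack,
                          const std::string& needle,
                          std::size_t start = 0)
{
    return haystack.find(needle, start);
}

// Position of the first character that equals ANY character of `set`,
// or std::string::npos if none of them occurs.
std::size_t any_char_pos(const std::string& haystack,
                         const std::string& set,
                         std::size_t start = 0)
{
    return haystack.find_first_of(set, start);
}
```

For the haystack "phonemizer" and the argument "mize", substring_pos returns 5 (where "mize" begins), while any_char_pos returns 4 (the first 'e'). So even if find_first_of accepted a utf8_string, it would answer a different question than a substring search.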
asked Sep 15 '18 at 1:46
Shawn McMurdo
1 Answer
Shawn, currently tiny_utf8 does not support find_first_of with a utf8_string as argument. However, to answer your second question: you can convert a utf8_string into a char32_t buffer using utf8_string::to_wide_literal( &char32_buffer ).
I hope this helps at least a little bit (even though you said you fixed the problem yourself already, which I am glad to hear :D).
All the best,
Jakob
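For readers who want to see what such a conversion involves, here is a standard-library-only sketch that decodes UTF-8 bytes into a zero-terminated char32_t buffer. It is not tinyutf8's to_wide_literal and makes no claim about how the library implements it; the function name to_utf32 is hypothetical, and the code assumes the input is well-formed UTF-8 (it performs no validation).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode well-formed UTF-8 into a zero-terminated sequence of code points.
// Assumption: the input is valid UTF-8; malformed input is not detected.
std::vector<char32_t> to_utf32(const std::string& utf8)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra;  // number of continuation bytes that follow
        char32_t cp;        // payload bits taken from the lead byte
        if (lead < 0x80)      { extra = 0; cp = lead; }
        else if (lead < 0xE0) { extra = 1; cp = lead & 0x1F; }
        else if (lead < 0xF0) { extra = 2; cp = lead & 0x0F; }
        else                  { extra = 3; cp = lead & 0x07; }
        ++i;
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 0; k < extra && i < utf8.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    out.push_back(U'\0');  // terminator, so data() can serve as a C-style string
    return out;
}
```

The pointer returned by data() on the result has the type const char32_t* that the error message in the question asks for. Passing it to a find_first_of overload would still search for any one of those code points rather than the whole sequence.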
answered Nov 14 '18 at 21:11
Jakob Riedle