XQuery decoding HTTP request - unable to parse query










In XQuery 3.1 (under eXist-db 4.4), search requests arrive at a controller, where I create a parameter docset from the text parameter of the URL's query string:



else if (starts-with(lower-case($exist:path), "/search")) then
    <dispatch xmlns="http://exist.sourceforge.net/NS/exist">
        <forward url="{$exist:controller}/search.html"/>
        <view>
            <forward url="{$exist:controller}/modules/view.xql">
                <add-parameter name="docset"
                    value="{search:search-term-cleaner(request:get-parameter("text","norequest"))}"/>
                <add-parameter name="pagetype" value="search"/>
            </forward>
        </view>
    </dispatch>


I clean the text value of any incoming request to /search?text="" so that only certain characters reach the search query:



declare function search:search-term-cleaner($text as xs:string?) as xs:string? {
    let $cleanterm := replace($text, '[^A-Za-z+*0-9]', '')
    return $cleanterm
};


There are two problems, under two slightly different scenarios:



  1. If the request comes in as /search?text=some%+text, the site complains with:


org.eclipse.jetty.http.BadMessageException: 400: Unable to parse URI query
java.lang.IllegalArgumentException: Not valid encoding '%+t'




  2. If the request comes in as /search?text=some+text, the controller passes through sometext, without the permitted + sign.

Googling this has not led me to a solution, but I am not experienced in HTTP parsing and may not understand the problem well enough to search for the solution.



This is running on localhost, at http://localhost:8081/exist/apps/.










  • When getting parameters via request:get-parameter, you don’t need to unescape parameters that are URI-encoded. %20 and + are automatically handed to you as space characters.

    – joewiz
    Nov 16 '18 at 3:32







  • @joewiz I can't believe I struggled for hours without figuring out that (now obvious) fact. If you post that as an answer I'll accept it. It might be useful for some future searcher. Thanks again.

    – jbrehr
    Nov 16 '18 at 9:56















xpath xquery exist-db






edited Nov 15 '18 at 18:08







jbrehr

















asked Nov 15 '18 at 17:08









jbrehr







2 Answers

When getting parameters via request:get-parameter(), you don’t need to unescape parameters that are URI-encoded. %20 and + are automatically handed to you as space characters.







answered Nov 16 '18 at 14:51
joewiz
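
To illustrate the point in this answer, here is a minimal sketch in plain XQuery 3.1. It assumes the server has already decoded the query string, so that text=some+text reaches the cleaner as "some text". The function name local:clean-keep-spaces is hypothetical and not part of the original code; it only adds the space character to the whitelist from the question so that word boundaries survive.

```xquery
xquery version "3.1";

(: Hypothetical variant of search:search-term-cleaner that also keeps spaces.
   Assumption: the value passed in has already been URI-decoded by the server,
   so a "+" or "%20" in the URL arrives here as a space. :)
declare function local:clean-keep-spaces($text as xs:string?) as xs:string? {
    (: strip everything except letters, digits, "+", "*" and the space character,
       then trim and collapse runs of whitespace :)
    normalize-space(replace($text, '[^A-Za-z+*0-9 ]', ''))
};

(
    (: /search?text=some+text arrives decoded as "some text" :)
    local:clean-keep-spaces("some text"),
    (: /search?text=some%2Btext arrives decoded as "some+text", a literal plus :)
    local:clean-keep-spaces("some+text"),
    (: characters outside the whitelist are still removed :)
    local:clean-keep-spaces("some; <drop> text")
)
```

Under that assumption, a literal plus sign has to be sent as %2B in the URL, while an unencoded + is treated as a space. That would explain why the original pattern, which does not allow spaces, returns sometext.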

Functions such as util:unescape-uri and escape-uri are your friends. Since the string you are working with gets sent over HTTP, it will undergo escaping. You can find out more about the available escaping functions by searching for "escape" in the function documentation.

For more elaborate operations, consider normalize-unicode.







edited Nov 15 '18 at 20:25
answered Nov 15 '18 at 20:18
duncdrum
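
As one illustration of the escaping idea in this answer, the standard fn:encode-for-uri function can percent-encode a search term before it is placed in a link. This is only a sketch: the helper name local:search-link is hypothetical, and the /search?text= path is taken from the question.

```xquery
xquery version "3.1";

(: Hypothetical helper: build a search URL whose query value is percent-encoded. :)
declare function local:search-link($term as xs:string) as xs:string {
    "/search?text=" || encode-for-uri($term)
};

(
    (: "%" becomes %25 and "+" becomes %2B :)
    local:search-link("some%+text"),
    (: the space becomes %20 :)
    local:search-link("some text")
)
```

A link built this way contains only valid percent escapes, so it avoids the malformed %+t sequence shown in the question. It does not help when a user types a malformed URL by hand.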


  • In fact, one problem is the risk that someone can simply put a 'forbidden' character into the search string, as in the example I gave, some%+text. I don't actually understand what happens when that hits the controller, so I can't work out a tactic for handling it.

    – jbrehr
    Nov 15 '18 at 20:29











  • The best thing you can do, in my view, is to URI-encode in the input form and URI-decode in your XQuery. If you really want to limit the characters (ǚ, , ` ` ) people can search for, more elaborate functions are necessary. But the basic idea remains the same: encode the input, decode for processing.

    – duncdrum
    Nov 15 '18 at 20:45












  • Yes, the input is encoded in the form, which I handle fine. I'm trying to handle the contingency where a user types the string directly into the browser (as in the first example).

    – jbrehr
    Nov 15 '18 at 20:54
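
The error message in the question, Not valid encoding '%+t', suggests that the server rejects a % that is not followed by two hexadecimal digits. If so, the request fails while the query string is being decoded, and the cleaner function never receives a value. The sketch below mirrors that rule in plain XQuery 3.1. The name local:valid-percent-encoding is hypothetical, and the check illustrates the encoding rule rather than Jetty's actual implementation.

```xquery
xquery version "3.1";

(: Hypothetical check: true when every "%" in a raw (still encoded) query value
   is followed by two hexadecimal digits. :)
declare function local:valid-percent-encoding($raw as xs:string) as xs:boolean {
    matches($raw, '^([^%]|%[0-9A-Fa-f]{2})*$')
};

(
    (: false: "%+t" is not a valid escape sequence :)
    local:valid-percent-encoding("some%+text"),
    (: true: %2B is the escape for a literal "+" :)
    local:valid-percent-encoding("some%2Btext"),
    (: true: the value contains no "%" at all :)
    local:valid-percent-encoding("some+text")
)
```

Read this way, the two scenarios in the question have different causes. some%+text is malformed at the HTTP level, whereas some+text is well-formed and simply decodes to a space.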
















