How to write the SQL equivalent of code written in Python / Spark

I am trying to clean data with PySpark, using a library called Optimus (see https://github.com/ironmussa/Optimus).



It uses the following chain of method calls to transform the data:



# Parentheses let the method chain span multiple lines.
(
    df
    .rows.sort("product", "desc")
    .cols.lower(["firstName", "lastName"])
    .cols.date_transform("birth", "new_date", "yyyy/MM/dd", "dd-MM-YYYY")
    .cols.years_between("birth", "years_between", "yyyy/MM/dd")
    .cols.remove_accents("lastName")
    .cols.remove_special_chars("lastName")
    .cols.replace("product", "taaaccoo", "taco")
    .cols.replace("product", ["piza", "pizzza"], "pizza")
    .rows.drop(df["id"] < 7)
    .cols.drop("dummyCol")
    .cols.rename(str.lower)
    .cols.apply_by_dtypes("product", func, "string", data_type="integer")
    .cols.trim("*")
    .show()
)


Can someone let me know what the equivalent commands would be in SQL?
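
As a rough starting point, a few of the simpler steps in the chain (sort, lower, replace, row filter, column drop, rename, trim) might map onto plain SQL along the lines sketched below. This is only an illustration under stated assumptions, not a translation produced by Optimus. It runs against an in-memory SQLite database through Python's built-in sqlite3 module so that it works without a Spark installation. The table name `people` and the sample rows are made up. It assumes `cols.replace` does substring replacement and that `rows.drop(df["id"]<7)` removes the rows matching the condition. `ORDER BY`, `LOWER`, `REPLACE`, `TRIM` and `WHERE` also exist in Spark SQL, but the exact behaviour should be checked against the Spark version in use. Steps such as `remove_accents` or `apply_by_dtypes` with a Python function have no single standard SQL counterpart and would probably need a user-defined function.

```python
import sqlite3

# Hypothetical sample data; the column names follow the snippet in the question.
rows = [
    (5, "  Ana ", "LOPEZ", "taaaccoo", "x"),
    (8, "Bob", "Smith ", "piza", "x"),
    (9, " Cy", "JONES", "pizzza", "x"),
    (10, "Dee", "Ng", "taco", "x"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER, firstName TEXT, lastName TEXT, "
    "product TEXT, dummyCol TEXT)"
)
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?)", rows)

QUERY = """
SELECT
    id,
    -- cols.lower + cols.trim; the lower-case aliases mimic cols.rename(str.lower)
    TRIM(LOWER(firstName)) AS firstname,
    TRIM(LOWER(lastName))  AS lastname,
    -- the two cols.replace calls, assumed to be substring replacements
    REPLACE(REPLACE(REPLACE(product, 'taaaccoo', 'taco'),
                    'pizzza', 'pizza'),
            'piza', 'pizza') AS product
    -- dummyCol is simply not selected, which mimics cols.drop("dummyCol")
FROM people
WHERE NOT (id < 7)            -- rows.drop(df["id"] < 7): keep the other rows
ORDER BY people.product DESC  -- rows.sort("product", "desc") on the original values
"""

cursor = conn.execute(QUERY)
columns = [d[0] for d in cursor.description]
result = cursor.fetchall()

print(columns)
for row in result:
    print(row)
```

With Spark itself, a query of this shape could be tried by registering the DataFrame as a temporary view with `df.createOrReplaceTempView("people")` and passing the SQL text to `spark.sql(...)`. The date handling (`date_transform`, `years_between`) is left out here because SQLite and Spark SQL use different date functions.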

sql python-3.x apache-spark

asked Nov 10 at 22:15
user485868


  • How far have you gotten before posting this question?
    – cricket_007
    Nov 10 at 23:36
