Yeah, my resume right now is Python + SQL (which does everything for my job) and a dozen pieces of expensive software that my boss insists we make use of.
I'm starting to use Python for everything I can and finally got the Python–SQL connection to work. Unfortunately I can't get it working in a Jupyter notebook or an interactive Python session, so I have to run the scripts, wait 10 minutes, and then fix any error or move on to the next step of the process. Funnily enough, importing the pandas library is the bottleneck; the SQL queries are somehow fast in comparison.
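(If you want to confirm where the time actually goes, a quick timer around the import is enough, or run the script with `python -X importtime` for a full breakdown. A minimal sketch, nothing specific to my setup:)

```python
import time

t0 = time.perf_counter()
import pandas as pd  # reportedly the slow step here
print(f"pandas import took {time.perf_counter() - t0:.1f}s")
```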
I can’t help much on the SQL connector end (they’re black magic and I’m grateful to the SQL gods every time they work). But I have a couple of small tips. One: if you have CRUD access on the SQL server, it’s useful to make a small table to debug the import script against more quickly; if you don’t have CRUD access, then using TOP or LIMIT in your SQL query would do the trick. Another thing, and you may already be doing this: in the script where you pull data from SQL, it’s generally better to write to a Parquet file rather than something like CSV. Parquet preserves the pandas dataframe structure (dtypes and all), which can be slow to recreate.
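A rough sketch of both tips, assuming a SQLAlchemy + pyodbc stack and a made-up connection string and table name (swap in whatever you actually use; Parquet also needs pyarrow or fastparquet installed):

```python
import pandas as pd
import sqlalchemy as sa  # assumed connector stack, not necessarily yours

# Hypothetical connection string -- use whatever already works for you
engine = sa.create_engine(
    "mssql+pyodbc://my_server/my_db?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes"
)

# Tip 1: while debugging, cap the rows so the pull finishes in seconds
query = "SELECT TOP 1000 * FROM dbo.big_table"  # TOP on SQL Server, LIMIT on most other engines
df = pd.read_sql(query, engine)

# Tip 2: Parquet round-trips the dtypes, so reloading skips the slow re-inference
df.to_parquet("big_table_sample.parquet")
df_again = pd.read_parquet("big_table_sample.parquet")
```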
Also, you can definitely get SQL connectors to work in a Jupyter notebook, but it’s probably not worth the effort. It’s not like you’re doing any exploratory work at the data-pull stage anyway.
The real problem is that the SQL server doesn’t accept connections from my PC, so I had the IT guys install an embedded version of Python on a server that can connect to the SQL server. From VS Code I select that Python as the interpreter, but it only works for scripting, because every time I run something it uses that remote Python to run the script in a terminal. The SQL scripts themselves I run in some Microsoft SQL Server application; the Python side I just wanted for running different queries, processing the results with pandas, and exporting them formatted to Excel or whatever.
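What I have in mind for that last part is roughly the sketch below, with a hypothetical connection string, placeholder queries, and xlsxwriter standing in for whatever formatting I end up needing:

```python
import pandas as pd
import sqlalchemy as sa  # assumed stack; this would live on the server-side Python

# Hypothetical connection string for the server that can reach the SQL server
engine = sa.create_engine(
    "mssql+pyodbc://my_server/my_db?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes"
)

# Placeholder queries standing in for the real ones
queries = {
    "sales": "SELECT TOP 1000 * FROM dbo.sales",
    "inventory": "SELECT TOP 1000 * FROM dbo.inventory",
}

# One workbook, one sheet per query, with a bit of formatting via xlsxwriter
with pd.ExcelWriter("report.xlsx", engine="xlsxwriter") as writer:
    for sheet, sql in queries.items():
        df = pd.read_sql(sql, engine)
        df.to_excel(writer, sheet_name=sheet, index=False)
        ws = writer.sheets[sheet]
        ws.autofilter(0, 0, len(df), len(df.columns) - 1)  # filter row over the data
        ws.set_column(0, len(df.columns) - 1, 18)          # widen the columns a bit
```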