#dask is strange. Sometimes using the dask counterpart to numpy functions or arrays makes computations slower. Sometimes not. Also, lots of variability in runtime. #python #dataengineering
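A minimal sketch of the kind of comparison behind that observation: for modest array sizes, dask.array's task-graph overhead can outweigh any benefit from parallelism, and timings swing around run to run. The array size and chunking below are assumptions for illustration, not a benchmark recipe.

```python
import time

import numpy as np
import dask.array as da

x_np = np.random.random((5_000, 5_000))
x_da = da.from_array(x_np, chunks=(1_000, 1_000))  # dask counterpart of the same data

t0 = time.perf_counter()
np_result = (x_np ** 2).sum()            # eager numpy
t1 = time.perf_counter()
da_result = (x_da ** 2).sum().compute()  # lazy graph, executed here
t2 = time.perf_counter()

print(f"numpy: {t1 - t0:.3f}s  dask: {t2 - t1:.3f}s")  # expect run-to-run variability
```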
Thank you to the #Kone and Maj and Tor Nessling Foundations for supporting this work. A quantitative work like this would not be possible without a robust suite of FOSS tools. My thanks to the maintainers of #QGIS, #pandas, #geopandas, #duckdb, #dask, #statsmodels, #jupyter and many more!
Working on solutions for large-scale #ScientificComputing?
#EuroSciPy2025 wants your original research on parallel and distributed computing with #Python!
Submit your breakthrough approaches to scaling scientific workloads as tutorials, talks, or posters:
I've improved my StackOverflow question and added a bounty. I'm once again asking the amazing #python , #dask , and #Django community if you could offer some of your knowledge to me and the world. I suppose this might just be a #Dask question, but I am boosting it to reach out to anyone who might lend a hand
https://stackoverflow.com/q/79198230
Would any of the wonderful #python , #dask , or #Django people have a few minutes to spare helping me with a performance question? Our community is so wonderful and I'm so grateful for you all
https://stackoverflow.com/questions/79198230/django-dask-integration-how-to-do-more-with-less
I am moving all my computing libraries to #xarray, no regrets. It is a natural way to manipulate datasets of rectangular arrays, with named coordinates and dimensions: https://xarray.dev/
There are several possible backends, including #dask which allows lazy data loading.
I had the pleasure of meeting some of the devs last week, who showed me a preview of the upcoming `DataTree` structure which is going to make this library even more versatile!
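For anyone curious what the lazy-loading combination looks like in practice, here is a minimal sketch; the file name, variable, and chunk size are made up for illustration.

```python
import xarray as xr

# Passing `chunks` makes xarray back the variables with dask arrays (lazy loading).
ds = xr.open_dataset("model_output.nc", chunks={"time": 100})

monthly_mean = ds["temperature"].groupby("time.month").mean()  # still lazy
result = monthly_mean.compute()  # data is only read and reduced here
```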
Me: Groks #dask, teaches herself @matplotlib, pretty fluent in @pandas_dev and #python.
Also me: signs an index incorrectly, spends 2 hours debugging a list index out of range error before spotting it
Now imagine how this #scales with tools like #Copilot and #GenerativeAI #coding tools ...
As part of my #PhD work, I recently had to perform computation on two very large files using @pandas_dev and I turned to #dask - a set of libraries on top of #pandas, aimed at scaling #python workloads from the laptop to the cluster.
Here's what I learned!
https://blog.kathyreid.id.au/2024/01/27/scaling-python-dask/
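As a rough illustration of the pattern the post describes (not the code from the blog), the dask.dataframe API mirrors pandas closely, so a laptop-scale script carries over with few changes; the file pattern and column names here are assumptions.

```python
import dask.dataframe as dd

# Each matching file becomes one or more partitions, read lazily.
df = dd.read_csv("measurements_*.csv")

# Familiar pandas-style operations build a task graph instead of executing eagerly.
summary = df.groupby("category")["value"].mean()

print(summary.compute())  # the graph runs across local cores or a cluster
```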
#python #geodata It's so convenient these days to have libraries like #xarray and #rioxarray that can open huge image mosaic files with 45k x 29k pixels in a virtual fashion automagically, using #dask under the hood. Just look up a few hundred pixels with xarray indexing and add a `.compute()` at the end to get the result. So cool. Thanks to all those devs for making it work so nicely!
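A minimal sketch of that workflow, assuming a large GeoTIFF mosaic; the file name and pixel window are invented for illustration.

```python
import rioxarray

# chunks=True asks rioxarray to back the raster with dask arrays (lazy, windowed reads).
mosaic = rioxarray.open_rasterio("huge_mosaic.tif", chunks=True)

window = mosaic.isel(x=slice(20_000, 20_200), y=slice(10_000, 10_200))
pixels = window.compute()  # only the chunks covering this window are actually read
```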
Good morning folks! It's been a while since I did one of my #TwitterMigration #Introduction #ConnectionList posts where I curate interesting people for you to follow on the #Fediverse
Today, I'd like you to meet:
@LMonteroSantos Lola is a #PhD #researcher at #EUI interested in #data #regulation, digital #economy and #AntiTrust, passionate about #DataScience and #programming. New to Mastodon, please make her welcome
@danlockton is a #Professor at @TUEindhoven where he works in #design, #imagination and #climate #futures. He often posts interesting things around co-design and #collaboration
@1sabelR is a #researcher @ANUResearch where she is into #SolarPunk and @scicomm She co-hosts the #SciBurst #podcast - worth a listen!
@timrichards is a #travel #writer based in #Naarm / #Melbourne in Australia, specialising in #rail
@microstevens is a #DataScience facilitator at #UWMadison and she works in #OpenScience and #genomics
@mrocklin does amazing things with #dask in #python, and I am very grateful in recent weeks for his posts and #StackOverflow responses. Thank you
@everythingopen is Australia's premier open #technology conference, covering #linux, #OpenSource, #OpenData, #OpenGov, #OpenGLAM, #OpenScience and everything else open. You should check it out!
That's all for today - don't forget to share your own lists so we can more richly connect and curate the conversations we want to have
My #dask coding worked and I got my data! I have been trying to get this data for three weeks
Today's job is to manually validate it.
Random #research idea while babysitting a #dask process:
- I wonder if there's a way to save a bunch of terms of service documents to be able to version them, and show how they have changed over time, particularly in respect to arbitration, copyright and other #datafication processes?
Why yes I partitioned a 1Gb file into 2000 partitions with #dask, why do you ask?
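For the record, a sketch of how one might dial that back: 2,000 partitions for roughly 1 GB means tiny partitions and a lot of scheduler overhead. The file name and target partition size are assumptions.

```python
import dask.dataframe as dd

df = dd.read_csv("data.csv")
print(df.npartitions)  # e.g. 2000 tiny partitions

# Aim for fewer, larger partitions (roughly 100 MB each is a common rule of thumb).
df = df.repartition(partition_size="100MB")
```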
I'm taking my first foray into #dask - have done the tutorial and read what I can in Stack Overflow.
It's definitely a steep learning curve, but it's been very interesting so far.
@holden's excellent book has been very useful so far, and I think the more I work with it, the more I will master the nuances - how to set up the Client and scheduler with the optimum number of workers and threads, the optimum partitioning, etc.
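Not from the book, just a minimal sketch of the knobs mentioned above: a local cluster where the worker count, threads per worker, and memory limit are explicit. The numbers are assumptions to be tuned per machine and workload.

```python
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=4, threads_per_worker=2, memory_limit="4GB")
client = Client(cluster)

print(client.dashboard_link)  # diagnostics dashboard for watching tasks and memory
```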
So I'm almost finished with my first independent implementation of a standard, and I want to write up the process bc it was surprisingly challenging and I learned a lot about how to write them.
I was purposefully experimenting with different methods of translation (e.g. adapter classes vs. pure functions in a build pipeline, recursive functions vs. flattening everything), so the code isn't as sleek as it could be. I had planned on this beforehand, but two major things I learned were a) not just isolating special cases, but making specific means to organize them and make them visible, and b) isolating different layers of the standard (e.g. schema language is separate from models is separate from I/O) and not backpropagating special cases between layers.
This is also my first project that's fully in the "new style" of python that's basically a typed language with validating classes, and it makes you write differently but uniformly for the better - it's almost self-testing bc if all the classes validate in an end-to-end test then you know that shit is working as intended. Forcing yourself to deal with errors immediately is the way.
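A minimal sketch of that "typed, validating classes" style using pydantic (v2); the model and fields are invented for illustration, not taken from the project.

```python
from pydantic import BaseModel, field_validator


class Recording(BaseModel):
    name: str
    sampling_rate_hz: float
    channel_ids: list[int]

    @field_validator("sampling_rate_hz")
    @classmethod
    def must_be_positive(cls, value: float) -> float:
        if value <= 0:
            raise ValueError("sampling_rate_hz must be positive")
        return value


# Construction fails loudly on bad data, so an end-to-end test that builds every
# model doubles as a validation pass over the whole pipeline.
rec = Recording(name="session-01", sampling_rate_hz=30_000.0, channel_ids=[0, 1, 2])
```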
Lots more 2 say but anyway we're like 2 days of work away from a fully independent translation of #NWB to #LinkML that uses @pydantic models + #Dask for arrays. Schema extensions are now no-code: just write the schema (in nwb schema lang or linkml) and poof you can use it. Hoping this makes it way easier for tools to integrate with NWB, and my next step will be to put them in a SQL database and triple store so we can, y'know, more easily share and grab smaller pieces of them and index across lots of datasets.
Then, uh, we'll bridge our data archives + notebooks with the fedi for a new kind of scholarly communication....
Learn how to make your #Python code perform faster during our interactive Parallel #Programming #Workshop, and solve practical problems using #Dask, #Numba, and #Snakemake.
Register now!
https://www.esciencecenter.nl/event/parallel-programming-in-python-3/
super short #dask question: Do the scheduler and client need to run with the same python version (or even conda env) as the code I want to use for production? Or is it independent?
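Not an answer, but one thing that helps when chasing this: distributed's Client can report the Python and package versions seen by the client, scheduler, and workers, which makes mismatches visible. The scheduler address below is an assumption.

```python
from dask.distributed import Client

client = Client("tcp://scheduler-host:8786")

# check=True raises if required package versions differ between client, scheduler, and workers.
versions = client.get_versions(check=True)
print(versions["client"])
print(versions["scheduler"])
```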