A Software Ecosystem for Big Data Astronomy
I believe next-generation galaxy surveys demand the creation of high-quality software and centralized computing infrastructure. These tools should free working astronomers to think more about the science they want to do and less about the details of how a large-scale analysis should be implemented. The technologies needed to build these kinds of tools exist throughout the private sector, but the knowledge and skills required to piece them together into working systems are not generally prioritized in astronomy education.
I believe these skills are essential to the future of astronomy. Rather than trying to teach everyone to be a better engineer, we should prioritize supporting a smaller number of engineering-focused individuals who have the knowledge necessary to appreciate the unique challenges of working with astronomical datasets.
A selection of software I’ve written:
cosmap | sample the universe at scale
cosmap is a high-level tool for defining analyses that depend on repeatedly retrieving data from a large survey. cosmap handles parameter management, data retrieval, dependency analysis, and results output. You get to focus on the astronomy.
github.com/patrickrwells/cosmap
heinlein | data management for the sky
heinlein accelerates queries on the sky. Quickly grab images, catalog data, and more from a specific location on the sky. heinlein understands the relationships between your data types and caches data so you can come back to it easily.
github.com/patrickrwells/heinlein
godata | file management for science
godata manages scientific data on disk so you don’t have to. Never worry about where your data is on disk. Just add it to a project and grab it when you need it.
github.com/patrickrwells/godata
Languages and Technologies
Advanced Usage
Intermediate/Working Knowledge
Basic Familiarity