<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Data | 2i2c</title><link>https://deploy-preview-614--2i2c-org.netlify.app/tag/data/</link><atom:link href="https://deploy-preview-614--2i2c-org.netlify.app/tag/data/index.xml" rel="self" type="application/rss+xml"/><description>Data</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 30 Nov 2025 00:00:00 +0000</lastBuildDate><image><url>https://deploy-preview-614--2i2c-org.netlify.app/media/sharing.png</url><title>Data</title><link>https://deploy-preview-614--2i2c-org.netlify.app/tag/data/</link></image><item><title>Supporting NASA Openscapes Champions with Cloud Infrastructure</title><link>https://deploy-preview-614--2i2c-org.netlify.app/blog/nasa-openscapes-champions-2025/</link><pubDate>Sun, 30 Nov 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-614--2i2c-org.netlify.app/blog/nasa-openscapes-champions-2025/</guid><description>&lt;p>
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/collaborators/openscapes/" >Openscapes&lt;/a> ran a NASA Champions program in November, bringing 30 participants together to learn about NASA Earthdata and the earthaccess Python library. We provided JupyterHub infrastructure for the hands-on breakout sessions, a good example of using shared infrastructure to facilitate learning and collaboration in remote events.&lt;/p>
&lt;p>They used their JupyterHub for co-working, where participants practiced streaming techniques for accessing cloud data without downloading. Multiple NASA Data Centers (NSIDC, ORNL, ASDC, PO.DAAC) collaborated to co-teach using the shared environment, succeeding despite the event happening the day after a government shutdown.&lt;/p>
&lt;p>They also used the event to grow the Openscapes community by inviting attendees to join their Slack and sign up for
&lt;a href="https://openscapes.org/events/2025-12-15-earthaccess-hackday/" target="_blank" rel="noopener" >their December earthaccess hack day&lt;/a>. It&amp;rsquo;s a great example of leveraging shared community infrastructure to help newcomers learn quickly and join a science community.&lt;/p>
&lt;p>Read their
&lt;a href="https://openscapes.org/blog/2025-11-27-nasa-champions-2025-summary/" target="_blank" rel="noopener" >full event summary&lt;/a> to learn how they structured the program and engaged their community.&lt;/p></description></item><item><title>Fixing the mybinder.org usage analytics archive</title><link>https://deploy-preview-614--2i2c-org.netlify.app/blog/mybinder-analytics-fix/</link><pubDate>Tue, 14 Oct 2025 00:00:00 +0000</pubDate><guid>https://deploy-preview-614--2i2c-org.netlify.app/blog/mybinder-analytics-fix/</guid><description>&lt;p>The analytics archive at &lt;code>archive.analytics.mybinder.org&lt;/code> powers the
&lt;a href="https://hub.jupyter.org/binder-data/" target="_blank" rel="noopener" >mybinder.org usage dashboards&lt;/a> and provides a
&lt;a href="https://github.com/jupyterhub/binder-data" target="_blank" rel="noopener" >daily-published dataset&lt;/a> that researchers and communities use to understand how Binder is being used across different domains and scientific communities.&lt;/p>
&lt;p>While updating our
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/blog/mybinder-analytics-fix/../binder-report-q3/" >quarterly Binder impact report&lt;/a>, we discovered that the archive index page had stopped updating. The analytics publisher wrote index files to temporary storage before uploading them to Google Cloud Storage, but for reasons that remain unclear the upload step had stopped working. We
&lt;a href="https://github.com/jupyterhub/mybinder.org-deploy/pull/3462" target="_blank" rel="noopener" >deployed a fix&lt;/a> that eliminates the temporary files entirely: the code now generates the HTML index as a string in memory and uploads it directly.&lt;/p>
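&lt;p>As a rough illustration of the in-memory approach (not the code from the pull request), the sketch below builds an HTML index as a string and hands it straight to an upload function, with no temporary file in between. The function names, the file names, and the &lt;code>upload&lt;/code> callable are all hypothetical; with the google-cloud-storage client, the upload step would typically be a call such as &lt;code>blob.upload_from_string&lt;/code>.&lt;/p>

```python
import xml.etree.ElementTree as ET


def render_index(filenames):
    """Return an HTML index of the given files as a string, built in memory."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    listing = ET.SubElement(body, "ul")
    for name in sorted(filenames):
        item = ET.SubElement(listing, "li")
        link = ET.SubElement(item, "a", href=name)
        link.text = name
    return ET.tostring(html, encoding="unicode", method="html")


def publish_index(filenames, upload):
    """Render the index and pass it directly to upload; no temporary file."""
    # upload is a hypothetical callable: (object_name, data, content_type).
    upload("index.html", render_index(filenames), "text/html")


if __name__ == "__main__":
    uploaded = {}
    # Stand-in for a real storage client: records what would be uploaded.
    publish_index(
        ["events-2025-10-02.jsonl", "events-2025-10-01.jsonl"],
        lambda name, data, content_type: uploaded.update({name: (data, content_type)}),
    )
    print(uploaded["index.html"][0])
```

&lt;p>Passing the uploader in as a callable keeps the rendering step testable without cloud credentials.&lt;/p>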
&lt;figure id="figure-the-mybinderorg-analytics-archivehttpsarchiveanalyticsmybinderorg-shows-a-list-of-daily-usage-reports-that-anybody-can-download">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./featured.png" alt="The [mybinder.org analytics archive](https://archive.analytics.mybinder.org) shows a list of daily usage reports that anybody can download." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
The
&lt;a href="https://archive.analytics.mybinder.org" target="_blank" rel="noopener" >mybinder.org analytics archive&lt;/a> shows a list of daily usage reports that anybody can download.
&lt;/figcaption>&lt;/figure>
&lt;p>Fortunately, we didn&amp;rsquo;t lose any data! Thanks to some smart design decisions, the daily analytics files were collected properly the entire time; only the index page listing them was broken. You can find
&lt;a href="https://archive.analytics.mybinder.org" target="_blank" rel="noopener" >the full archive here&lt;/a>.&lt;/p>
&lt;h2 id="learn-more">
Learn more
&lt;a class="header-anchor" href="#learn-more">#&lt;/a>
&lt;/h2>&lt;ul>
&lt;li>
&lt;a href="https://github.com/jupyterhub/mybinder.org-deploy/pull/3462" target="_blank" rel="noopener" >Pull request with the fix&lt;/a>&lt;/li>
&lt;li>
&lt;a href="https://hub.jupyter.org/binder-data/" target="_blank" rel="noopener" >mybinder.org usage dashboards&lt;/a>&lt;/li>
&lt;li>The
&lt;a href="https://github.com/jupyterhub/binder-data" target="_blank" rel="noopener" >&lt;code>binder-data/&lt;/code> repository&lt;/a> is where we aggregate and publish archive data to make it more accessible.&lt;/li>
&lt;li>
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/blog/mybinder-analytics-fix/../binder-report-q3/" >Our quarterly impact report from mybinder.org&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="acknowledgements">
Acknowledgements
&lt;a class="header-anchor" href="#acknowledgements">#&lt;/a>
&lt;/h2>&lt;ul>
&lt;li>Thanks to the
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/collaborators/jupyterhub/" >JupyterHub community&lt;/a> for their collaboration on mybinder.org infrastructure&lt;/li>
&lt;/ul></description></item><item><title>Enforcing per-user storage quotas with `jupyterhub-home-nfs`</title><link>https://deploy-preview-614--2i2c-org.netlify.app/blog/per-user-storage-quota/</link><pubDate>Tue, 28 Jan 2025 09:57:28 +0000</pubDate><guid>https://deploy-preview-614--2i2c-org.netlify.app/blog/per-user-storage-quota/</guid><description>&lt;p>When users share a storage disk, as is usually the case in a JupyterHub deployment, it is important to put guardrails in place so that no single user can consume the storage capacity that the other users depend on.
To this end, 2i2c, in close collaboration with
&lt;a href="https://developmentseed.org" target="_blank" rel="noopener" >Development Seed&lt;/a>, has developed the
&lt;a href="https://github.com/2i2c-org/jupyterhub-home-nfs" target="_blank" rel="noopener" >&lt;code>jupyterhub-home-nfs&lt;/code> project&lt;/a>, a Helm chart that enforces per-user quotas on storage space.&lt;/p>
&lt;div class="alert alert-note">
&lt;div>
Note that this feature is currently available only to AWS-hosted hubs and will be rolled out to other cloud providers in the future.
&lt;/div>
&lt;/div>
&lt;p>Under the hood, the Helm chart runs
&lt;a href="https://github.com/nfs-ganesha/nfs-ganesha" target="_blank" rel="noopener" >NFS Ganesha&lt;/a> as an in-cluster NFS server, backed by
&lt;a href="https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-xfs" target="_blank" rel="noopener" >XFS&lt;/a> as the underlying filesystem. Storage quotas are enforced through XFS&amp;rsquo;s native quota management utility, &lt;code>xfs_quota&lt;/code>.&lt;/p>
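&lt;p>To give a feel for what quota enforcement with &lt;code>xfs_quota&lt;/code> can look like, here is a minimal sketch that only builds the command lines for one home directory and does not run them. It assumes XFS project quotas (which require the filesystem to be mounted with project quotas enabled); whether the chart works exactly this way is an assumption, and the mount point, directory, project ID, and limit are hypothetical values.&lt;/p>

```python
import shlex


def quota_commands(mount_point, home_dir, project_id, hard_limit_gib):
    """Build, but do not run, xfs_quota commands for one home directory.

    All argument values are hypothetical; the real chart may differ.
    """
    if not hard_limit_gib > 0:
        raise ValueError("hard_limit_gib must be positive")
    # Register the directory tree as an XFS project with a numeric ID.
    setup = ["xfs_quota", "-x", "-c",
             f"project -s -p {home_dir} {project_id}", mount_point]
    # Set a hard block limit on that project.
    limit = ["xfs_quota", "-x", "-c",
             f"limit -p bhard={hard_limit_gib}g {project_id}", mount_point]
    return [setup, limit]


if __name__ == "__main__":
    # Print shell-quoted commands for an example user directory.
    for cmd in quota_commands("/export", "/export/alice", 1001, 10):
        print(shlex.join(cmd))
```

&lt;p>Returning argument lists rather than running the commands keeps the sketch safe to try on any machine; in a real deployment they would be executed on the NFS server that hosts the XFS volume.&lt;/p>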
&lt;p>Since this feature moves our infrastructure away from managed filesystems (such as AWS&amp;rsquo;s Elastic File System) that cannot support per-user storage quotas, we have also developed monitoring and alerting mechanisms that let us know when the disks are getting full, as well as automated backups for disaster recovery.&lt;/p>
&lt;p>If you would like to try this on your 2i2c-managed hub,
&lt;a href="https://docs.2i2c.org/support" target="_blank" rel="noopener" >please get in touch&lt;/a>.&lt;/p>
&lt;p>This project can also be used with &lt;em>any&lt;/em> Kubernetes-based JupyterHub, as per our
&lt;a href="https://2i2c.org/right-to-replicate/" target="_blank" rel="noopener" >Right to Replicate policy&lt;/a>, so please try it out on your own deployment and let us know what you think!&lt;/p>
&lt;h2 id="acknowledgements">
Acknowledgements
&lt;a class="header-anchor" href="#acknowledgements">#&lt;/a>
&lt;/h2>&lt;p>This project was developed and deployed in collaboration with
&lt;a href="https://developmentseed.org/team/tarashish-mishra/" target="_blank" rel="noopener" >Tarashish Mishra&lt;/a> from
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/collaborators/devseed/" >Development Seed&lt;/a>, funded through the
&lt;a href="https://deploy-preview-614--2i2c-org.netlify.app/collaborators/nasa-veda/" >NASA VEDA project&lt;/a>.&lt;/p></description></item></channel></rss>