ℹ️ DAT Linux 1.0 (RC - release candidate) is out!

Introduction

DAT Linux is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment.

It’s based on Ubuntu 22.04 “Jammy Jellyfish”*, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop shop for running and managing dozens of data science programs.

*Based on a customised Lubuntu desktop environment.



DAT Linux focuses on providing easy access and fast updates to the latest stable version of each tool or application, independent of package managers that, let’s face it, do not always have the most up-to-date versions.

DAT Linux is perfect for students, professionals, academics, or anyone interested in data science who doesn’t want to spend endless hours downloading, installing, configuring, and maintaining applications from a range of sources, each with different technical requirements and set-up challenges.


To get started:

⬇️ Download DAT Linux, and get on with doing data science without the headaches.

ℹ️ FAQ for answers to some questions you may have.

💬️ GitHub Community channel for announcements, or to post any feedback, issues or general queries.


List of supported data science apps:

Get the PDF version of the apps list: https://datlinux.com/About Apps.pdf

Suggest an app: https://github.com/dat-linux/community/discussions/categories/app-extra-book-suggestions


| App | Description |
| --- | --- |
| BiRT | Eclipse BIRT™ is an open source reporting system for producing compelling BI reports |
| Data Cleaner | Data Quality toolkit that allows you to profile, correct and enrich your data |
| Datasette | Datasette is a tool for exploring and publishing data visually and with SQL |
| DB Browser | DB Browser for SQLite is a visual, open source tool to create, design, and edit database files compatible with SQLite |
| DBeaver | Free multi-platform database tool for developers, database administrators, analysts and all people who need to work with databases |
| D-Search | Convenient interface to the “webtools” R package to search for datasets in all CRAN packages |
| E-Git | EGit is an Eclipse-based GUI for the Git version control system |
| Glue-viz | Glue is a UI and Python library to explore relationships within and among related datasets |
| Gnumeric | Gnumeric is a spreadsheet program that is part of the GNOME Free Software Desktop Project |
| GNU Plot | gnuplot is a command-line and GUI program that can generate two- and three-dimensional plots of functions, data, and data fits |
| G-Vim | A GUI wrapper for the Vim screen-based text editor program, with plugins for R installed |
| IPython | A command shell for interactive computing with a convenient console launcher |
| Julia | Julia is a high-level, high-performance, dynamic programming language |
| Jupyter Notebook | The Jupyter Notebook is a web-based interactive, scientific computing platform |
| Jupyter Lab | JupyterLab is the latest web-based interactive development environment for notebooks, code, and data |
| KNIME | KNIME Analytics Platform is open source software for data science |
| LabPlot | Free, open source and cross-platform Data Visualization and Analysis software accessible to everyone |
| LibreOffice Calc | LibreOffice Calc is the spreadsheet component of the LibreOffice software package |
| Meld | Meld is a visual file diff and merge tool |
| MOA | MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms |
| OpenRefine | OpenRefine is an open-source desktop application for data cleanup and transformation to other formats |
| Orange | Orange is a powerful platform to perform data analysis and visualization |
| Paraview | ParaView is an open-source, multi-platform data analysis and visualization application |
| Pluto.jl | A Pluto notebook is made up of small blocks of Julia code (cells) and together they form a reactive notebook |
| PSPP | GNU PSPP is a program for statistical analysis of sampled data. It is a free-as-in-freedom replacement for the proprietary program SPSS |
| QGIS | A Free and Open Source Geographic Information System |
| R | R is a free software environment for statistical computing and graphics |
| R-Studio | RStudio is an Integrated Development Environment (IDE) for R |
| Spyder | Spyder is a free and open source scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts |
| Superset | Apache Superset is a modern, enterprise-ready business intelligence web application |
| Tabula | Tabula is a free tool for extracting data from PDF files into CSV and Excel files |
| Veusz | Veusz is a scientific plotting and graphing program with a graphical user interface, designed to produce publication-ready 2D and 3D plots |
| Visidata | Visidata is an interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, and can handle millions of rows with ease |
| VSCodium | VSCodium is a community-driven, freely-licensed binary distribution of Microsoft’s editor VS Code (ready with plugins for R/RMarkdown, Python/Jupyter, Julia) |
| Weka | Weka is a GUI and collection of machine learning algorithms for data mining tasks |
| WxMaxima | wxMaxima is a document-based interface for the computer algebra system Maxima |
| Zeppelin | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more |