Getting Started with Linux


Data Engineer based in Jakarta, Indonesia. When I first started my career, I was set on becoming a Software Engineer or Backend Engineer, but I then realized I was more interested in being a Data Practitioner. I am currently focused on data engineering and cloud infrastructure. In my free time, I go jogging, listen to J-pop music, and try to learn Japanese.

What is Linux?

Linux is an operating system (OS), first released by Linus Torvalds in 1991, that is freely distributable under the GNU General Public License (GPL). In 1998, large corporations such as IBM and Oracle announced support for the Linux platform and began investing in it heavily. Linux grew increasingly popular in the early 2000s, especially in the server industry, while distributions such as Red Hat and Mandrake made it easier to use.

Architecture of Linux

The Linux architecture is made up mainly of the kernel, the system libraries, the hardware layer, the system utilities, and the shell.

Kernel

The kernel is the central component of the Linux operating system. It takes care of things like CPU scheduling and memory management. The CPU runs code in one of two modes: kernel mode or user mode.

User mode and kernel mode are two different operating states that determine the level of access and control a running program has.

  • Kernel mode

Kernel mode is the privileged mode in which the core part of the operating system, the kernel, executes. Code running in kernel mode has unrestricted access to system resources such as device drivers, hardware control, and memory management.

The system switches the CPU into kernel mode to carry out an operation that needs hardware access on behalf of a program. Control returns to user mode once the operation has finished.

  • User mode

User mode is the normal mode, in which a process has restricted access. When a program launches, the OS creates a separate process for it and assigns that process its own memory space. Programs running in user mode must make system calls to the kernel to request access to hardware or kernel memory.
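One way to see the two modes at work is through the kernel's own accounting. The sketch below assumes a Linux system with /proc mounted. It reads the first line of /proc/stat, where the kernel reports the cumulative CPU time spent in user mode and in kernel (system) mode, measured in clock ticks. The variable names are only illustrative.

```shell
# First line of /proc/stat has the form: "cpu user nice system idle ..."
# All values are cumulative clock ticks since boot.
read -r label user_ticks nice_ticks system_ticks idle_ticks rest < /proc/stat

echo "time spent in user mode:   ${user_ticks} ticks"
echo "time spent in kernel mode: ${system_ticks} ticks"
```

Running it twice a few seconds apart should show both counters growing, since ordinary programs keep switching into kernel mode through system calls.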

Shell

The shell is the command-line interface that allows users to interact with the operating system. The shell receives commands and passes them to the kernel, which then performs the requested actions. Each command runs only after the previous command has finished, unless the execution flow is altered by specific operators.
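The operators that alter execution flow can be sketched as follows. This is a minimal illustration for a POSIX-style shell such as sh or bash; the variable names are made up for the example.

```shell
# ';' runs commands one after another, regardless of exit status.
echo "first"; echo "second"

# '&&' runs the next command only if the previous one succeeded.
true && and_result="ran after success"

# '||' runs the next command only if the previous one failed.
false || or_result="ran after failure"

# '|' connects the output of one command to the input of the next.
sorted_first=$(printf 'b\na\n' | sort | head -n 1)
echo "first line after sorting: ${sorted_first}"

# '&' starts a command in the background; 'wait' blocks until it finishes.
sleep 1 &
wait
echo "background job finished"
```

Only the background operator `&` and the pipe `|` let commands run at the same time. The others still run commands in sequence and only decide whether the next one runs at all.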

There are different types of shells for Linux systems:

  • The Bourne Shell (sh)

    The Bourne shell was developed at AT&T Bell Labs by Stephen Bourne in the late 1970s and became the default shell on Unix systems.

  • The Bourne Again Shell (bash)

    The Bourne Again shell was written by Brian Fox for the GNU Project and first released in 1989 as a free replacement for the Bourne shell, for general purposes and all kinds of users. Most Linux distributions, such as Ubuntu, Debian, and Fedora, use it as the default interactive shell. It uses the prompt # for the root user and $ for non-root users.

  • The C Shell (csh)

    The C shell was created at the University of California, Berkeley by Bill Joy in the late 1970s to improve interactive use, with a syntax modeled on the C language. On systems like Red Hat, instead of the C shell you will find its extended version, tcsh. It uses the prompt # for the root user and % for non-root users.

  • The Korn Shell (ksh)

    The Korn shell was developed at AT&T Bell Labs by David Korn in the early 1980s to improve on the Bourne shell. It uses the prompt # for the root user and $ for non-root users.

  • The Z Shell (zsh)

    The Z shell was created at Princeton University by Paul Falstad in 1990 as an extension of the Bourne shell. The Z shell later became the default shell on macOS. It uses the prompt # for the root user and % for non-root users.
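To check which shell you are using and which prompt character applies to you, a small sketch like the one below can help. It assumes a Bourne-style shell (sh, bash, ksh), where the convention is # for root and $ for everyone else. The variable name prompt_char is only illustrative.

```shell
# Login shell recorded for the current user (may be unset in minimal environments).
echo "login shell: ${SHELL:-unknown}"

# UID 0 is root; Bourne-style shells conventionally show '#' for root and '$' otherwise.
if [ "$(id -u)" -eq 0 ]; then
    prompt_char='#'
else
    prompt_char='$'
fi
echo "prompt character: ${prompt_char}"
```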

Top 5 Linux Shells You Should Know | Image created by author

Applications

An application is a computer program that carries out a particular task or activity. Applications rely on system libraries, which are pre-written collections of code that offer shared features and APIs to other applications and system utilities. These libraries make it possible for programs to communicate with the kernel and carry out operations such as memory management and file processing.

Hardware

The hardware layer is the lowest level of the system and consists of all the physical and peripheral devices (RAM, HDD, CPU, etc.). The rest of the operating system reaches these components through the kernel and its device drivers, which handle I/O activities, memory management, and CPU control.

Utilities

Utilities are the command-line tools that carry out tasks on behalf of users to support system administration and management. These utilities enable users to perform a variety of tasks, such as file management, system monitoring, network configuration, and user administration. The init system, for example, initializes user space at boot and manages system processes at runtime.

Linux Operating System | https://zitoc.com/wp-content/uploads/2019/04/Linux-Architecture.png

Linux Distribution Family Tree

A Linux distribution, sometimes known as a distro, is an operating system that consists of a package management system, a set of tools and software, and a Linux kernel.

Typically, a Linux distribution is obtained directly from its maintainer.

The Linux distribution family tree shows how distributions are related to, and derived from, one another.

As of 2025, there are over 600 active Linux distributions available, so it can be challenging to choose the one that best fits your needs.

Some of the major Linux distributions are:

  • Debian

    Debian was initially released on 16th August 1993 with a strong emphasis on free software. Debian is a powerful, independent Linux distribution and also the foundation for many other distros.

  • Fedora

    Fedora Linux was originally developed in 2003 as a continuation of the Red Hat Linux project. Fedora-based Linux distributions include the official Fedora editions, which customize the user experience, and various third-party remixes that offer specialized features for areas like gaming or scientific computing.

  • openSUSE

    The first version was released as SUSE Linux in 1994, and development was opened up to the community in 2005, which marked the creation of openSUSE. openSUSE is an open-source community project that produces Linux-based distributions and is sponsored by SUSE Software Solutions Germany GmbH and other companies.

  • Ubuntu

    Ubuntu was first released in 2004 by Canonical Ltd, a UK-based company dedicated to promoting the use of open-source software. One of the most popular and widespread distros for general use, Ubuntu is built on top of Debian, which provides a large repository of software and a stable, tested foundation.

  • Red Hat Enterprise Linux (RHEL)

    In 2002, Red Hat began releasing Red Hat Enterprise Linux, based on Red Hat Linux but with a much more conservative release cycle and a subscription-based support program. RHEL is a commercial operating system developed by Red Hat for businesses and enterprise use, known for its stability, security, and long-term support.

  • SUSE Linux Enterprise Server (SLES)

    SLES was developed by a small team on the basis of SUSE Linux and was first released in October 2000. SUSE Linux Enterprise is a stable, flexible, and secure enterprise-grade Linux operating system from the German company SUSE, designed for servers, mainframes, and desktops in business environments.

  • Linux Mint

    Linux Mint is a community-developed Linux distribution based primarily on Ubuntu and was created in August 2006. Linux Mint is built upon the foundational Linux families, inheriting its architecture from Ubuntu and Debian.

Linux Distribution Family Structure | https://microsoft.github.io/WhatTheHack/020-LinuxFundamentals/Student/resources/images/linuxkernel-distros.png

The Filesystem Hierarchy Standard (FHS)

Linux distributions follow the Filesystem Hierarchy Standard (FHS), which specifies the directory layout and the contents of each directory. In the Linux file system hierarchy, the root directory is the highest directory, represented by the forward slash (/). It acts as the system's starting point for all other directories and files.

Exploring the /bin directory: Essential Command Binaries

The /bin ("binary") directory contains essential command binaries in the Linux file system hierarchy that are required for the system to boot and run in single-user mode.

Exploring the /dev directory: Device Files

The /dev directory holds device files that represent hardware devices, like disk drives, USB devices, keyboards, and printers. For example, /dev/sda might represent the first hard drive, and /dev/null is a special file that discards all data written to it.
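The behavior of /dev/null is easy to try out. The sketch below assumes a standard Linux system. The path /nonexistent-path is deliberately made up so that ls fails.

```shell
# Anything written to /dev/null is discarded.
echo "this text disappears" > /dev/null

# Reading from /dev/null returns end-of-file immediately, so it yields zero bytes.
null_bytes=$(wc -c < /dev/null)
echo "bytes read from /dev/null: ${null_bytes}"

# Common use: silence a command's error output.
# /nonexistent-path is a hypothetical path chosen so that ls fails.
ls /nonexistent-path 2> /dev/null || echo "ls failed, but its error message was discarded"
```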

Exploring the /etc directory: Configuration Files

The /etc directory contains system-wide configuration files and shell scripts used by programs and services.

Some important files and directories found in /etc include:

  • /etc/passwd: Stores user account information

  • /etc/shadow: Stores hashed user passwords (readable only by root)

  • /etc/hosts: Maps hostnames to IP addresses

  • /etc/fstab: Specifies file system mount points

  • /etc/init.d/: Contains scripts for starting and stopping system services

  • /etc/default/: Stores default configuration settings for various services
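Most files in /etc are plain text, so standard tools can read them. As a small sketch, the example below pulls a few fields out of /etc/passwd, assuming the system has a root account listed there (true on typical Linux installations). The variable name root_entry is only illustrative.

```shell
# Each line of /etc/passwd has seven colon-separated fields:
# name:password:UID:GID:comment:home:shell
# Print the name, home directory, and login shell of the root account.
root_entry=$(awk -F: '$1 == "root" { print $1, $6, $7 }' /etc/passwd)
echo "$root_entry"
```

The password field in /etc/passwd normally contains just an x, because the real password hashes are kept in /etc/shadow.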

Exploring the /var directory: Variable Data

The /var ("variable") directory contains variable files that are expected to grow over time, such as log files, mail spools, print queues, and temporary files that do not fit in /tmp.

Some common subdirectories within /var include:

  • /var/log: Stores system log files

  • /var/spool: Contains spool directories for services like email, print queues, and cron jobs

  • /var/tmp: Holds temporary files that may be preserved across system reboots

  • /var/cache: Stores cached data for applications and services

  • /var/lib: Holds persistent data for various applications and services

Exploring the /usr directory: Unix System Resources

The /usr directory contains the majority of user-space applications and files, including system libraries, documentation, and binaries for installed software.

Some common subdirectories within /usr include:

  • /usr/bin: Contains binary executable files for user applications and utilities

  • /usr/sbin: Holds system administration binaries for the root user

  • /usr/lib: Stores shared libraries required by various programs

  • /usr/share: Contains architecture-independent data files, such as documentation, icons, and localization files

  • /usr/include: Provides header files used for software development

  • /usr/local: Intended for locally installed software and files
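To find out where a given program actually lives, you can ask the shell. The sketch below uses ls as the example command. Note that on many current distributions /bin is simply a symbolic link to /usr/bin (a "merged /usr" layout), which the second part checks for rather than assumes.

```shell
# 'command -v' prints the path the shell would use for a command.
ls_path=$(command -v ls)
echo "ls lives at: ${ls_path}"

# On distributions with a merged /usr, /bin is a symbolic link to /usr/bin.
if [ -L /bin ]; then
    echo "/bin is a symlink to $(readlink /bin)"
else
    echo "/bin is a separate directory"
fi
```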

Exploring the /home directory: User Home Directories

The /home directory contains personal directories for all users on the system. For example, if your username is “datarunner01,” your home directory would be /home/datarunner01.

Exploring the /lib and /lib64 directories: Shared Library Modules

The /lib and /lib64 directories contain shared libraries (similar to .dll files in Windows) required by the binaries in /bin and /sbin. Some common libraries found in these directories include:

  • libc.so: The C standard library, which provides essential functions for C programs

  • libm.so: The math library, containing mathematical functions

  • libpthread.so: The POSIX threads library, used for multi-threaded programming

  • libz.so: The zlib compression library, used for data compression and decompression
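To see which shared libraries a program depends on, the ldd tool can be used where it is available. This is a sketch that assumes ldd is installed and that ls is dynamically linked, which is the usual case but not guaranteed. The exact library paths in the output vary between distributions.

```shell
# Locate the ls binary, then list the shared libraries it is linked against.
ls_binary=$(command -v ls)
ldd "$ls_binary" || echo "ldd is unavailable or ${ls_binary} is not dynamically linked"
```

On a typical system the output includes a line for libc.so, the C standard library mentioned above.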

Exploring the /mnt and /media directories: Temporary Mount Filesystems and Removable Media

The /mnt directory is the traditional location for manually mounting file systems, while /media is the directory that modern desktop environments and automounting systems typically use to mount removable media automatically. For example, if you insert a USB flash drive, it may be automatically mounted to a directory like /media/usb_drive, allowing you to access and manage the files on the drive.

Exploring the /opt directory: Add-on Software Packages

The /opt ("optional") directory contains add-on application software packages. For example, proprietary software or applications that are not included in the default Linux distribution are often installed here.

Overview of other directories

The following directories also serve specific purposes:

  • /boot: files required for the initial boot process, such as the kernel and boot loader configuration

  • /proc: a virtual file system that provides a way to interact with the kernel and to access system information and hardware configuration details

  • /root: home directory for the administrator (root) account

  • /run: a temporary file system with runtime data for system services and processes

  • /srv: data for services provided by the system

  • /sys: a virtual file system that exposes hardware and driver information

  • /tmp: storage location for temporary files created by users or programs

  • /var: variable files such as log files, temporary files, and mail spools
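Because /proc is a virtual file system, its "files" can be read with ordinary commands even though nothing is stored on disk. A minimal sketch, assuming a Linux system with /proc mounted:

```shell
# /proc/version describes the running kernel; 'uname -r' prints just the release.
cat /proc/version
kernel_release=$(uname -r)
echo "kernel release: ${kernel_release}"

# /proc/meminfo reports memory statistics; its first line is the total RAM.
head -n 1 /proc/meminfo

# /proc/<PID> holds per-process data; $$ is the PID of the current shell.
ls /proc/$$ | head -n 5
```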

The Linux file system is structured for organization, security, and efficiency.

The FHS figure | Source: https://gyires.inf.unideb.hu/GyBITT/20/images/ch3-fhs.png

Basic Linux CLI commands

Knowing basic Linux commands is important for efficient file management and organization. Here's a beginner-friendly guide to some of the essential Linux commands:

  • cd: Changes the directory.

  • ls: Lists files and directories in the current directory.

  • mkdir: Creates a new directory.

  • man: Displays the manual for a command.

  • mv: Moves or renames files or directories.

  • pwd: Displays the current directory path.

  • rm: Deletes a file, or a directory when used with the -r option.

  • sudo: Executes a command with superuser (admin) privileges.

  • touch: Creates a new empty file.
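The commands above can be tried together in a short session. The sketch below works inside a temporary directory created with mktemp so that nothing important is touched. The directory and file names (notes, draft.txt, final.txt) are just examples. man and sudo are left out because they are interactive or need extra privileges.

```shell
# Work inside a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir notes                 # create a new directory
cd notes                    # move into it
pwd                         # print the current directory path
touch draft.txt             # create an empty file
mv draft.txt final.txt      # rename the file
listing=$(ls)               # list the directory contents
echo "$listing"             # prints: final.txt

rm final.txt                # delete the file
cd /                        # leave the directory before removing it
rm -r "$workdir"            # delete the directory and everything in it
```

Be careful with rm, and especially rm -r: deleted files do not go to a trash folder.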

Summary

Linux is an open-source operating system that is freely distributable under the GNU General Public License (GPL). First released by Linus Torvalds, it gained popularity in the early 2000s, especially in the server industry. The architecture of Linux consists of the kernel, the system libraries, the hardware layer, the system utilities, and the shell.

The Linux kernel is the central component, handling tasks like CPU scheduling and memory management. There are two main ways the CPU functions in Linux: kernel mode, which is the privileged mode for the core OS to access system resources, and user mode, which is the normal mode where a process has restricted access. The shell is a command-line interface that allows users to interact with the operating system by receiving commands and sending them to the kernel for execution. I hope you enjoyed reading this.
