Linux Basics

aifare platform instances use Linux (Ubuntu distribution) as the default operating system. Mastering basic Linux commands is essential for efficient AI development and model training. Below are commonly used commands and typical scenarios on the platform.

File and Directory Operations

List Files/Directories

  • ls: List files and directories in the current directory
  • ls -l: Show detailed information (permissions, owner, size, time, etc.)
ls
ls -l

Create/Switch Directory

  • mkdir: Create a new directory
  • cd: Change directory
mkdir data_dir
cd data_dir
cd ../data_dir # Enter data_dir under the parent directory
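The commands above can be combined into a short sketch for nested directories. The scratch directory and the project/data/raw layout are hypothetical, chosen only so the example does not touch real data.

```shell
# Hypothetical scratch location so the example does not touch real data
workdir="$(mktemp -d)"
cd "$workdir"

# -p creates missing parent directories and does not fail if they already exist
mkdir -p project/data/raw
mkdir -p project/data/raw

cd project/data/raw
pwd                 # print where we are now

cd ..               # go up one level, into project/data
parent_name="$(basename "$(pwd)")"
echo "$parent_name"
```

Without -p, mkdir reports an error when a parent directory is missing or the target already exists.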

View Current Path

  • pwd: Display the current working directory
pwd

Rename/Move File or Directory

  • mv: Move or rename
mv old_name new_name
mv file.txt /data/

Copy Files/Folders

  • cp: Copy files
  • cp -r: Recursively copy folders
cp file.txt /data/
cp -r myfolder /user-data/

Delete Files/Folders

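  • rm -rf: Recursively and forcibly delete without asking for confirmation. Deleted files cannot be recovered, so check the path before running it
rm -rf temp_dir
rm -rf /data/* # Delete everything under /data except hidden files (names starting with a dot)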
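Because rm -rf is irreversible, it can help to preview the target and guard any variable used in the path. The following is a minimal sketch under stated assumptions: the scratch directory and file names are hypothetical, and the guard uses the standard ${var:?} shell expansion.

```shell
# Hypothetical scratch directory created just for this demonstration
workdir="$(mktemp -d)"
mkdir -p "$workdir/temp_dir/sub"
touch "$workdir/temp_dir/a.txt" "$workdir/temp_dir/.hidden"

# Preview what is there before deleting anything (-A also shows hidden files)
ls -A "$workdir/temp_dir"

# ${workdir:?} aborts the command if the variable is empty or unset,
# which prevents the path from collapsing to "/temp_dir" or "/"
rm -rf "${workdir:?}/temp_dir"/*

# The * pattern does not match names starting with a dot, so .hidden survives
remaining="$(ls -A "$workdir/temp_dir")"
echo "$remaining"
```

To remove hidden files as well, delete the directory itself (rm -rf temp_dir) and recreate it.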

Environment Variable Settings

  • export: Set environment variables
export PATH=/opt/miniconda3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
  • View environment variables:
env | grep PATH
  • Persistent effect: Write the export command into ~/.bashrc, then run source ~/.bashrc to apply it to the current shell. New shells opened by the same user load it automatically
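The difference between a plain assignment and export can be seen by asking a child process what it receives. This is an illustrative sketch; DEMO_DATA_DIR and its value are hypothetical names, not platform settings.

```shell
# A plain assignment is visible only inside the current shell
DEMO_DATA_DIR=/data/demo        # hypothetical variable name and value
before="$(sh -c 'echo "${DEMO_DATA_DIR:-unset}"')"

# export copies the variable into the environment of child processes
export DEMO_DATA_DIR
after="$(sh -c 'echo "${DEMO_DATA_DIR:-unset}"')"

echo "before export: $before"
echo "after export:  $after"
```

Programs started from the shell, such as python, are child processes, which is why variables like PATH and LD_LIBRARY_PATH need export to take effect for them.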

Text Editing

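  • vim is the recommended editor. Basic workflow: open a file with vim file.txt, press i to enter insert mode, press Esc to leave it, then type :wq to save and quit or :q! to quit without saving. For more advanced usage, refer to related tutorials.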

Compression and Decompression

  • zip/unzip: Compress and decompress in zip format
  • tar: Archive tool that bundles files into a single archive; with the z option it also compresses and decompresses with gzip (.tar.gz)
zip -r data.zip /data/
unzip data.zip

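tar czf data.tar.gz /data/ # c: create, z: gzip, f: archive file name
tar xzf data.tar.gz # x: extract; files are restored under data/ in the current directory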
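A round trip with tar can be sketched as follows. The scratch directory and file names are hypothetical; the -C and t options are standard tar options that select the working directory and list an archive without extracting it.

```shell
# Hypothetical scratch directory with one sample file
workdir="$(mktemp -d)"
mkdir -p "$workdir/data"
echo "sample" > "$workdir/data/a.txt"

# -C switches into $workdir first, so the archive stores the relative path data/...
tar czf "$workdir/data.tar.gz" -C "$workdir" data

# t lists the archive contents without extracting anything
listing="$(tar tzf "$workdir/data.tar.gz")"
echo "$listing"

# Extract into a separate directory instead of the current one
mkdir -p "$workdir/restore"
tar xzf "$workdir/data.tar.gz" -C "$workdir/restore"
restored="$(cat "$workdir/restore/data/a.txt")"
echo "$restored"
```

Listing an archive before extracting it is a cheap way to check where the files will land.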

View GPU Information

  • nvidia-smi: View GPU status, memory usage, driver version, etc.
nvidia-smi
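For scripts or logs, a compact per-GPU summary is often easier to read than the full table. The sketch below assumes nvidia-smi is installed on the instance and falls back to a message otherwise; the selected fields are one possible choice.

```shell
if command -v nvidia-smi >/dev/null 2>&1; then
    # One CSV row per GPU: index, model name, memory used/total, utilization
    nvidia-smi --query-gpu=index,name,memory.used,memory.total,utilization.gpu --format=csv \
        || echo "nvidia-smi query failed"
    gpu_tool="found"
else
    echo "nvidia-smi not found; this may not be a GPU instance"
    gpu_tool="missing"
fi
```

For a view that refreshes every second in an interactive terminal, watch -n 1 nvidia-smi is a common alternative.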

Process Management

  • ps -ef: View all processes
  • kill -9 PID: Force kill a process (SIGKILL). The process gets no chance to clean up, so try plain kill PID (SIGTERM) first
ps -ef | grep python
kill -9 12345
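A gentler shutdown sequence can be sketched as follows: request termination first, and force it only if the process is still alive. The sleep command is a stand-in for a real job, and the one-second grace period is an arbitrary assumption.

```shell
# Stand-in for a long-running job (hypothetical; replace with your own process)
sleep 300 &
pid=$!

# Ask the process to exit cleanly first (SIGTERM is the default signal)
kill "$pid"

# Give it a moment, then force-kill only if it is still alive
sleep 1
if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid"
fi

# Collect the exit status; 143 means 128 + 15, i.e. terminated by SIGTERM
status=0
wait "$pid" 2>/dev/null || status=$?
echo "exit status: $status"
```

kill -0 sends no signal at all; it only checks whether the process still exists.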

View CPU/Memory Usage

  • top: Real-time view of CPU, memory, and process resource usage
top

Log Redirection and Background Running

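  • >: Redirect standard output to a file, overwriting any existing file (use >> to append)
  • 2>&1: Send error output to the same destination as standard output; it must come after the > redirection
  • &: Run in the background
python train.py > train.log 2>&1 &
cat train.log
tail -f train.log # Follow the log as it grows; press Ctrl+C to stop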
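The redirection pattern can be tried without a real training script. In this sketch, fake_train is a hypothetical stand-in for python train.py that writes one line to standard output and one to error output.

```shell
workdir="$(mktemp -d)"
log="$workdir/train.log"

# Stand-in for "python train.py": one line on stdout, one on stderr
fake_train() {
    echo "epoch 1 finished"
    echo "warning: something to look at" >&2
}

# stdout goes to the file, stderr follows it, and the job runs in the background
fake_train > "$log" 2>&1 &
wait $!                          # only so the demonstration can inspect the finished log

# >> appends to the file instead of overwriting it
echo "run complete" >> "$log"

tail -n 2 "$log"                 # show the last two lines
line_count="$(wc -l < "$log" | tr -d ' ')"
echo "lines in log: $line_count"
```

Both the normal message and the warning end up in the same file, which is what makes a single log sufficient for debugging.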

Common Scenario Examples

1. GPU Memory Not Released

  • Symptom: The program has stopped but GPU memory is still occupied
  • Solution: Use ps -ef to find leftover processes, terminate them with kill (kill -9 if they do not exit), then confirm with nvidia-smi that the memory has been released
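The search for leftover processes can be sketched as below. This assumes nvidia-smi and pgrep are available; inside some containers nvidia-smi cannot see process names, in which case the pgrep result is the more useful of the two.

```shell
if command -v nvidia-smi >/dev/null 2>&1; then
    # Processes currently holding GPU memory, one CSV row each
    nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv \
        || echo "nvidia-smi query failed"
fi

# Cross-check against Python processes owned by the current user.
# pgrep does not list itself, unlike "ps -ef | grep python".
leftover="$(pgrep -u "$(id -u)" -f python || true)"
echo "leftover python PIDs: ${leftover:-none}"
```

Check each PID with ps -ef before killing it, since the pattern matches any command line containing "python".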

2. Data/Model Sharing Across Instances

  • Requirement: Save models or data to the /user-data directory for sharing across multiple instances
cp model.pth /user-data/ # Add -r when copying a directory

3. Process Killed Due to Memory Overuse

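  • Symptom: The process is terminated by the system and the shell prints "Killed", usually because the Linux out-of-memory (OOM) killer stopped it
  • Solution: Use top to check memory usage, then reduce memory consumption in the code (for example, a smaller batch size or fewer data-loading workers) or upgrade the instance configuration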
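For a non-interactive snapshot, the memory check can be sketched as follows. It assumes a Linux system with /proc/meminfo and the procps version of ps, as shipped with Ubuntu.

```shell
# MemAvailable estimates how much memory new work can use without swapping
mem_available_kb="$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)"
echo "Available memory: $((mem_available_kb / 1024)) MiB"

# Header row plus the five largest processes by resident memory (RSS, in KiB)
ps -eo pid,rss,comm --sort=-rss | sed -n '1,6p'
```

On containerized instances the memory limit enforced on the instance can be lower than what /proc/meminfo reports, so treat the figure as an upper bound.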

4. Run a Background Process in the JupyterLab Terminal

  • Requirement: The job keeps running and its logs can still be viewed after the web page is closed
  • Solution: Redirect logs to a file and run in the background
python train.py > train.log 2>&1 &
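If a background job stops when its terminal session ends, prefixing the command with nohup is a common remedy, since it makes the job ignore the hangup signal (SIGHUP). Whether this is needed on the platform is an assumption; the sketch below uses a hypothetical short job in place of python train.py and saves the PID so the job can be checked later.

```shell
workdir="$(mktemp -d)"
log="$workdir/train.log"

# Stand-in for "python train.py" (hypothetical short job)
nohup sh -c 'echo "training started"; sleep 1; echo "training finished"' > "$log" 2>&1 &
train_pid=$!
echo "$train_pid" > "$workdir/train.pid"    # remember the PID for later

# Later, possibly from a new terminal: is the job still alive?
if kill -0 "$(cat "$workdir/train.pid")" 2>/dev/null; then
    echo "job is still running"
fi

wait "$train_pid"            # only so this demonstration ends after the job does
last_line="$(tail -n 1 "$log")"
echo "$last_line"
```

In real use, skip the wait line, close the page, and reopen the log later with tail -f train.log.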

For more Linux tips, please refer to the aifare platform documentation or community resources.