How to Split Large Text File into Multiple *.txt Files

As we have mentioned many times in earlier articles, whether directly or indirectly, the depth of control that a Linux operating system offers is hard to match in other operating systems.

Its open-source nature gives end users a rare level of transparency. While other operating systems provide the start button for baking a cake, Linux allows us to play with the cake ingredients as we move towards the final product.

This article explores the Linux steps for splitting a large text file into multiple smaller text files. It falls under the Linux file management segment.

One reason you might need to break a large text file into smaller files is to meet a storage or size limit. The large file might not fit on a removable medium in one piece, but splitting it makes it possible to transfer in parts.

Problem Statement

We will create a sample text file called large_file.txt to reference throughout this tutorial.

$ nano large_file.txt
Create Large File in Linux
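If typing the content by hand is not appealing, a sample file can also be generated. The following is a minimal sketch, assuming the actual content does not matter: the numbered lines are placeholder data, not the text used in this tutorial, so the byte size will differ from the figure quoted later on.

```shell
# Work in a scratch directory so the generated files are easy to clean up.
workdir=$(mktemp -d)
cd "$workdir"

# Write 49 numbered lines (placeholder content, an assumption for illustration).
seq 1 49 > large_file.txt

# Print the line count to confirm the file has 49 lines.
wc -l < large_file.txt
```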

We are going to look at several ways to break the above large text file into multiple smaller text files. Smaller files can also be sent over a network in parallel, which often shortens the overall transfer time.

Using the Linux split Command

The split command is part of the GNU Coreutils package and primarily splits an input file into multiple smaller files.

The syntax for the usage of the split command is as follows:

$ split [OPTION]... [FILE [PREFIX]]

The split utility supports several useful command options, as described in its man page ($ man split). By default, each output file holds up to 1000 lines of the input. Output file names are built from a default prefix (x) followed by a two-letter suffix (aa, ab, ac, and so on).

$ split large_file.txt
Split Large File into Multiple Files

We only see one output file (xaa) because the original text file has fewer than 1000 lines (49 lines), so split effectively created a duplicate of it.

$ wc -l large_file.txt
Count File Lines in Linux
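To see the default 1000-line limit produce more than one file, the input has to be longer than 1000 lines. The sketch below uses a hypothetical 2500-line file named big_sample.txt (not part of this tutorial), which should yield two full chunks and a shorter third one.

```shell
# Work in a scratch directory to keep the chunks away from other files.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 2500-line input (placeholder content).
seq 1 2500 > big_sample.txt

# No options: chunks of up to 1000 lines, named xaa, xab, xac.
split big_sample.txt

# Report the line count of each chunk.
wc -l xaa xab xac
```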

To retain the .txt file extension after file splitting, we will use the command option --additional-suffix.

$ split --additional-suffix=.txt large_file.txt
Retain File Extension While Splitting File
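The --additional-suffix option is not part of the POSIX split specification, so it may be missing on some systems. One possible workaround, sketched here under the assumption that the chunks use the default x prefix and two-letter suffix, is to rename the chunks after splitting. The file big_sample.txt is a hypothetical input.

```shell
# Work in a scratch directory so the glob below only matches our chunks.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 2500-line input (placeholder content).
seq 1 2500 > big_sample.txt

# Plain split with default names: xaa, xab, xac.
split big_sample.txt

# Append .txt to every chunk (x followed by exactly two characters).
for f in x??; do
    mv "$f" "$f.txt"
done

ls
```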

Splitting File by Specifying Number of Lines

Let’s say we want to split this large text file into smaller ones with 12 lines each. We will use the -l command option to specify the number of lines per output file.

$ split -l 12 --additional-suffix=.txt large_file.txt
Split File By Line Number

The 49-line large_file.txt has been split into 5 smaller files, each with a maximum of 12 lines: four files of 12 lines and a final file holding the remaining line.

$ cat xaa.txt | wc -l; cat xab.txt | wc -l; cat xac.txt | wc -l; cat xad.txt | wc -l; cat xae.txt | wc -l
List Count of Lines in File
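The same check can be written more compactly, since wc accepts several files at once. The sketch below also works out the expected number of chunks as the ceiling of 49 / 12. It assumes a placeholder 49-line file rather than the tutorial's original content.

```shell
# Work in a scratch directory so x*.txt only matches our chunks.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder 49-line input (assumption for illustration).
seq 1 49 > large_file.txt

split -l 12 --additional-suffix=.txt large_file.txt

# One wc call reports every chunk plus a total.
wc -l x*.txt

# Expected chunk count: ceil(lines / 12) using integer arithmetic.
lines=$(wc -l < large_file.txt)
chunks=$(( (lines + 12 - 1) / 12 ))
echo "expected chunks: $chunks"
```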

Splitting File by Specifying Resulting File Sizes

Our file has a file size of 170 bytes.

$ ls -l large_file.txt
List File Size in Linux

To split it into smaller files of 30 bytes each, we will use the -b command option.

$ split -b 30 --additional-suffix=.txt large_file.txt

The command has generated 6 smaller files with a maximum file size of 30 bytes each: five files of 30 bytes and a final file of 20 bytes.

$ ls -lh
Split File By Sizes
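The chunk sizes can be confirmed with wc -c. This is a rough sketch that assumes a 170-byte placeholder file made of filler characters. Note that -b cuts at exact byte offsets, so a line of text can end up divided between two chunks.

```shell
# Work in a scratch directory so x*.txt only matches our chunks.
workdir=$(mktemp -d)
cd "$workdir"

# 170 bytes of filler (placeholder content, not the tutorial's original text).
head -c 170 /dev/zero | tr '\0' 'a' > large_file.txt

split -b 30 --additional-suffix=.txt large_file.txt

# Byte count of each chunk: five of 30 bytes, then the 20-byte remainder.
wc -c x*.txt
```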

Splitting File by Specifying a Prefix

Let us assume we need the 30-byte split files above to carry the prefix large_file.log instead of the default x. We pass the prefix as the last argument of the command.

$ split -b 30 --additional-suffix=.txt large_file.txt large_file.log
Split File by Prefix
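Because split places the suffix directly after the prefix with no separator, ending the prefix with an underscore or dot can make the names easier to read. A small sketch, where the prefix large_file_part_ and the filler content are assumptions chosen for illustration:

```shell
# Work in a scratch directory to keep the chunks together.
workdir=$(mktemp -d)
cd "$workdir"

# 170 bytes of filler (placeholder content).
head -c 170 /dev/zero | tr '\0' 'a' > large_file.txt

# The trailing underscore in the prefix separates it from the aa, ab, ... suffix.
split -b 30 --additional-suffix=.txt large_file.txt large_file_part_

# Lists large_file_part_aa.txt through large_file_part_af.txt.
ls large_file_part_*
```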

Splitting File by Using Numeric Suffix

If you want the split files' suffixes to be numeric, like 00, 01, or 02, rather than letters like aa, ab, or ac, run the command with the -d command option.

$ split -d -b 30 --additional-suffix=.txt large_file.txt large_file_log
Split File by Number Suffix
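A quick way to confirm that nothing was lost is to concatenate the chunks in order and compare the result with the original. The sketch below assumes a 170-byte placeholder file and relies on the shell expanding the glob in sorted order, which matches the numeric suffixes 00 through 05.

```shell
# Work in a scratch directory so the glob only matches our chunks.
workdir=$(mktemp -d)
cd "$workdir"

# 170 bytes of filler (placeholder content).
head -c 170 /dev/zero | tr '\0' 'a' > large_file.txt

# -d switches the suffix from letters to digits: 00, 01, 02, ...
split -d -b 30 --additional-suffix=.txt large_file.txt large_file_log

ls large_file_log*

# Rejoin the chunks and compare them byte for byte with the original.
cat large_file_log*.txt | cmp - large_file.txt && echo "chunks match the original"
```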

We can now comfortably split a large text file into multiple smaller files while retaining the .txt file extension in Linux.
