How to Split Large Text File into Multiple *.txt Files

As we have mentioned in earlier articles, whether directly or indirectly, few operating systems match the depth of control that Linux gives its users.

Its open-source nature gives end-users an unusual level of transparency. While other operating systems provide the start button for baking a cake, Linux allows us to play with the cake ingredients as we move towards the final product.

This article walks through the steps for splitting a large text file into multiple smaller text files on Linux. This tutorial falls under the Linux file management segment.

One reason you might need to break a large text file into smaller files is to meet storage limits. The large file might not fit on a removable medium, but splitting it lets you transfer it piece by piece.

Problem Statement

We will create a sample text file called large_file.txt to reference throughout this tutorial.

$ nano large_file.txt
Create Large File in Linux
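If typing the sample by hand is inconvenient, a file of the same length can be generated instead. The following is a minimal sketch, not the file used in the screenshots: the line contents and the temporary working directory are assumptions made for illustration, so the byte size will differ from the figures quoted later in this article.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write 49 numbered lines; the wording of each line is made up for this sketch.
for i in $(seq 1 49); do
    printf 'line %d\n' "$i"
done > large_file.txt

# Report the number of lines in the generated file.
wc -l < large_file.txt
```
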

We are going to look at several ways to break the above large text file into multiple smaller text files. Smaller files can also be sent over a network in parallel, which often shortens the total transfer time.

Using the Linux split Command

The split command is part of the GNU Coreutils package and primarily splits an input file into multiple smaller files.

The syntax for the usage of the split command is as follows:

$ split [OPTION]... [FILE [PREFIX]]

The split utility supports several useful command options, as described in its man page ($ man split). By default, each output file holds up to 1000 lines of the input. Output file names consist of a default prefix (x) followed by a default two-letter suffix (aa, ab, ac, and so on).

$ split large_file.txt
Split Large File into Multiple Files

We only see one output file (xaa) because the original text file has fewer than 1000 lines (49, to be exact), so split has in effect created a copy of it.

$ wc -l large_file.txt
Count File Lines in Linux
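To see the 1000-line default produce more than one piece, the input needs more than 1000 lines. Here is a small sketch under assumed conditions: numbers.txt and its 2500-line length are hypothetical and chosen only so that the default chunk size is exceeded twice.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 2500 lines is an arbitrary size chosen to exceed the 1000-line default twice.
seq 1 2500 > numbers.txt

# No options: split writes chunks of up to 1000 lines named xaa, xab, xac, ...
split numbers.txt

# Show the line count of each piece.
wc -l xaa xab xac
```

The first two pieces should hold 1000 lines each and the third the remaining 500.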

To retain the .txt file extension after file splitting, we will use the command option --additional-suffix.

$ split --additional-suffix=.txt large_file.txt
Retain File Extension While Splitting File

Splitting File by Specifying Number of Lines

Let’s say we want to split this large text file into smaller ones with 12 lines each. We will use the -l command option to specify the number of lines per output file.

$ split -l 12 --additional-suffix=.txt large_file.txt
Split File By Line Number

The 49-line large_file.txt has been split into 5 smaller files: four with 12 lines each and a fifth holding the remaining line.

$ wc -l xaa.txt xab.txt xac.txt xad.txt xae.txt
List Count of Lines in File
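The number of pieces follows from a ceiling division of the total line count by the lines per piece. The sketch below checks that arithmetic against what split actually writes. It assumes GNU split (for --additional-suffix), and the variable names and the seq-generated stand-in file are illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 49 > large_file.txt   # stand-in for the 49-line sample file

lines_per_file=12
total_lines=$(wc -l < large_file.txt)

# Ceiling division: a partial last chunk still needs its own file.
expected=$(( (total_lines + lines_per_file - 1) / lines_per_file ))

split -l "$lines_per_file" --additional-suffix=.txt large_file.txt
actual=$(ls x*.txt | wc -l)

echo "expected $expected files, found $actual"
```

With 49 lines and 12 lines per piece, both values should come out as 5.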

Splitting File by Specifying Resulting File Sizes

Our file is 170 bytes in size.

$ ls -l large_file.txt
List File Size in Linux

To split it into smaller files of 30 bytes each, we will use the -b command option.

$ split -b 30 --additional-suffix=.txt large_file.txt

The command has generated 6 smaller files with a maximum file size of 30 bytes each: five full 30-byte files plus a sixth holding the remaining 20 bytes.

$ ls -lh
Split File By Sizes
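Because -b counts bytes, a piece can end in the middle of a line, so it is worth confirming that the pieces still add up to the original. The following sketch, assuming GNU split and a seq-generated stand-in file (138 bytes rather than the 170 bytes above), joins the pieces back together and compares the result. The name rejoined.txt is hypothetical.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 49 > large_file.txt   # stand-in sample; 138 bytes, not the article's 170

# Fixed 30-byte pieces: a piece may end in the middle of a line.
split -b 30 --additional-suffix=.txt large_file.txt

# The shell expands x*.txt in sorted order, which matches the order split wrote them.
cat x*.txt > rejoined.txt

# cmp exits with status 0 only when the two files are byte-for-byte identical.
cmp large_file.txt rejoined.txt && echo "pieces rejoin to the original"
```

If every piece must end on a line boundary, GNU split also provides the -C (--line-bytes) option, which puts at most the given number of bytes of whole lines into each output file.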

Splitting File by Specifying a Prefix

For instance, if we need the names of the 30-byte split files above to start with large_file.log, we pass that prefix as the last argument to the command.

$ split -b 30 --additional-suffix=.txt large_file.txt large_file.log
Split File by Prefix
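Each output name is built from the prefix, the generated suffix, and the additional suffix, in that order. A brief sketch of how the parts combine, assuming GNU split: the prefix part_ is a hypothetical choice that ends in an underscore so the pieces are easier to read.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 49 > large_file.txt   # stand-in sample file

# Output name = PREFIX + generated suffix (aa, ab, ...) + additional suffix.
split -b 30 --additional-suffix=.txt large_file.txt part_

# List the directory to see the resulting names next to the original file.
ls
```

The listing should show part_aa.txt, part_ab.txt, and so on, and no files starting with the default x prefix.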

Splitting File by Using Numeric Suffix

If you want the split file names to carry numeric suffixes like 00, 01, or 02 rather than letters like aa, ab, or ac, run the command with the -d command option.

$ split -d -b 30 --additional-suffix=.txt large_file.txt large_file_log
Split File by Numeric Suffix
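Two-digit numeric suffixes cover only 100 pieces, so larger jobs may need wider suffixes. The sketch below combines -d with the -a (suffix length) option. It assumes GNU split and a seq-generated stand-in file, and the width of 3 is an arbitrary choice for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 49 > large_file.txt   # stand-in sample file

# -d switches to numeric suffixes; -a 3 widens them from two characters to three.
split -d -a 3 -b 30 --additional-suffix=.txt large_file.txt large_file_log

# List the pieces to see the zero-padded numbering.
ls large_file_log*
```

The pieces should be named large_file_log000.txt, large_file_log001.txt, and so on.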

We can now comfortably split a large text file into multiple smaller files while retaining the .txt file extension in Linux.

Ravi Saive
I am an experienced GNU/Linux expert and a full-stack software developer with over a decade in the field of Linux and Open Source technologies. Founder of TecMint.com, LinuxShellTips.com, and Fossmint.com. More than 150 million people have visited my websites.


1 thought on “How to Split Large Text File into Multiple *.txt Files”

  1. I have a question about splitting a big .txt file into pieces that are saved under dynamic file names.

    Input file:

    aaaa
    Start-KEYWORD(OutputFIleName1)
    1a1a1a [1st file record starts]
    :
    2a2a2a [1st file record ends]
    end-KEYWORD.
    Start-KEYWORD(OutputFIleName2)
    3a3a3a [2nd file record starts]
    :
    4a4a4a [2nd file record ends]
    end-KEYWORD.
    Start-KEYWORD(OutputFIleName3)
    5a5a5a [3rd file record starts]
    :
    6a6a6a [3rd file record ends]
    end-KEYWORD.
    :
    etc..

    Output should be:

    OutputFIleName1.txt
    1a1a1a [1st file record starts]
    :
    2a2a2a [1st file record ends]

    OutputFIleName2.txt
    3a3a3a [2nd file record starts]
    :
    4a4a4a [2nd file record ends]

    OutputFIleName3.txt
    5a5a5a [3rd file record starts]
    :
    6a6a6a [3rd file record ends]

    This needs to be done with a shell script. Can anyone please suggest how to do it?
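The split command cuts only by line count or byte count, so a keyword-driven split like the one in this question is usually handled with awk instead. What follows is one possible sketch, not a tested answer for the real data: it assumes each Start-KEYWORD(...) line carries the output name in parentheses, that each block ends with an end-KEYWORD. line, and that input.txt is a hypothetical name for the input file.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A shortened stand-in for the input described in the question.
cat > input.txt <<'EOF'
aaaa
Start-KEYWORD(OutputFIleName1)
1a1a1a
2a2a2a
end-KEYWORD.
Start-KEYWORD(OutputFIleName2)
3a3a3a
4a4a4a
end-KEYWORD.
EOF

awk '
/^[ \t]*Start-KEYWORD\(/ {
    name = $0
    sub(/^[^(]*\(/, "", name)   # drop everything up to the opening parenthesis
    sub(/\).*$/, "", name)      # drop the closing parenthesis and anything after
    out = name ".txt"
    next
}
/^[ \t]*end-KEYWORD\./ {
    if (out != "") close(out)   # finish the current output file
    out = ""
    next
}
out != "" { print > out }       # lines outside any block are skipped
' input.txt

ls
```

Lines that appear before the first start keyword, such as aaaa, are ignored because no output file is open at that point.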
