Split Lines/sentence With Over 10 Words Where The First Comma Appears
I have the following code that splits the line every 10 words.
#!/bin/bash
while read line
do
counter=1;
    for word in $line
    do
        echo -n $word' ';
    if (($coun
Solution 1:
A better approach is to use awk: test whether the line has 15 or more words and, if so, replace the first ", " with ",\n", e.g.
awk 'NF >= 15 {sub (", ", ",\n")}1' file
Example Use/Output
With your input in file, you would have:
$ awk 'NF >= 15 {sub (", ", ",\n")}1' file
phrase from a test line,
which I want to split, and I don't know how.
(If you have a large number of lines, awk will be orders of magnitude faster than a shell loop.)
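For anyone who needs the same rule inside a script rather than a one-liner, here is a minimal Python sketch of what the awk command does. The function name split_at_first_comma and the min_words parameter are made up for illustration; the threshold of 15 mirrors the NF >= 15 test above.

```python
def split_at_first_comma(line, min_words=15):
    """Break a line after its first ", " when it has at least min_words words.

    Hypothetical helper mirroring: awk 'NF >= 15 {sub (", ", ",\\n")}1'
    """
    if len(line.split()) >= min_words:
        # count=1 replaces only the first ", ", just like awk's sub()
        return line.replace(", ", ",\n", 1)
    # shorter lines pass through unchanged, like the trailing 1 in awk
    return line


sample = "phrase from a test line, which I want to split, and I don't know how."
print(split_at_first_comma(sample))
```

Like the awk version, this leaves lines below the threshold untouched and only ever breaks at the first comma.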
Solution 2:
I am not sure if you want to split over 10 words or 15 words.
Simply replace the 10 with 15 in case you are dealing with 15 words.
awk 'NF > 10 { sub(/, */, ",\n") } 1' input.txt
or more clearly:
#!/bin/bash
awk 'NF > 10 {
    # enter this block only if the line has more than 10 words
    # replace the first comma, plus any spaces after it,
    # with a comma followed by a newline
    sub(/, */, ",\n")
}
# print every line, whether or not it was changed
{ print }' input.txt
Solution 3:
Here is a simple Python solution that checks the number of words in a string. If the string has more than 10 words, it is split at the first comma:
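output = []
s = 'phrase from a test line, which I want to split, and I dont know how'
# keep splitting at the first comma while more than 10 words remain;
# the "',' in s" check avoids a ValueError when no comma is left
while len(s.split()) > 10 and ',' in s:
    first_sent, s = s.split(',', 1)
    output.append(first_sent)
output.append(s)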
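The loop above drops the comma and leaves a leading space on the remainder. If that matters, the same idea can be wrapped in a small function. This is only a sketch: the name split_long_sentence and the max_words parameter are assumptions, not part of the original answer.

```python
def split_long_sentence(s, max_words=10):
    """Cut s at commas while the remaining text has more than max_words words.

    Hypothetical helper; returns the pieces as a list of strings.
    """
    pieces = []
    # stop when the remainder is short enough or has no comma left to cut at
    while len(s.split()) > max_words and "," in s:
        head, s = s.split(",", 1)
        pieces.append(head.strip() + ",")  # keep the comma on the cut piece
        s = s.strip()                      # drop the space that followed it
    pieces.append(s)
    return pieces


text = "phrase from a test line, which I want to split, and I dont know how"
print("\n".join(split_long_sentence(text)))
```

Returning a list keeps the splitting separate from the output format, so the caller can join the pieces with newlines or handle them one by one.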
Solution 4:
This is a simpler version of the question "for loop in bash simply prints n times the command instead of reiterating".
The simple version can be handled with
# For each line with more than 10 words, append a newline after the first comma
sed -r '/([^ ]+ ){10}/s/,/,\n/' input.txt