Channel: AWK count number of times a term appear with respect to other columns - Stack Overflow

AWK: count the number of times a term appears with respect to other columns

Given a CSV file:

id, fruit, binary
1, apple, 1
2, orange, 0
3, pear, 1
4, apple, 0
5, peach, 0
6, apple, 1

How can I calculate, for each unique value in the fruit column, the number of times the binary value equals 1 divided by the number of occurrences of that fruit in the fruit column?

Another way to get the numerator is to sum the values of the binary column for each unique fruit.

For example:

For the fruit apple, binary = 1 appears two times and the fruit has a frequency of 3. Hence I should get 2/3.

How can I write this in efficient AWK code?

I know that I can do this to get the unique values from the second column:

cut -d , -f2 file.csv | sort | uniq

or

awk -F, '{ a[$2]++ } END { for (b in a) { print b } }' file.csv

So my non-working code looks like this:

 cat file.csv | awk '{ a[$2]++ } END { for (b in a) if ($3==1) {sum+=$3} END {print $0 sum}'

and

awk '{ a[$2]++ } END { for (b in a) { sum +=1 } }' file.csv

I need help correcting my syntax and merging the two awk commands.
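For reference, here is a minimal sketch of how the two attempts could be merged into one pass that produces the ratio described above. It rests on a few assumptions that are not confirmed by the question: the file has a header row, fields are separated by a comma followed by optional spaces, and the binary column only holds 0 or 1. The array names count and ones are made up for illustration, and the sample file is recreated here only so the snippet can run on its own.

```shell
# Hypothetical sample data matching the example in the question
# (the file name file.csv is taken from the question's commands).
cat > file.csv <<'EOF'
id, fruit, binary
1, apple, 1
2, orange, 0
3, pear, 1
4, apple, 0
5, peach, 0
6, apple, 1
EOF

# -F', *' splits fields on a comma plus any spaces after it (assumed layout).
# NR > 1 skips the header row.
# count[fruit] holds the number of occurrences of each fruit;
# ones[fruit] holds the sum of the binary column for that fruit.
# The order of "for (f in count)" is unspecified in awk, so the output is sorted.
result=$(awk -F', *' '
  NR > 1 { count[$2]++; ones[$2] += $3 }
  END    { for (f in count) printf "%s %d/%d\n", f, ones[f], count[f] }
' file.csv | sort)

echo "$result"
```

With the sample data this should print one line per fruit, for example "apple 2/3". Both arrays are filled while the file is read, because inside END only the last record is still available, which is why testing $3 in the END block of the first attempt cannot work. To print a decimal instead of a fraction, the printf could use "%s %.2f\n" with ones[f] / count[f].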

