Let's say I want to compare two CSV files: file A and file B. Both have the same layout: the first column holds product IDs (one product per row) and the remaining columns provide some stats about the products, such as sales in # and $.
I want to compare these files - see which product IDs appear in the first column of file A and not in B, and which in B and not A.
Finally, it would be great if the result could be written into two new CSV files, with one product ID per row in the first column (no data is needed in the other columns).
This is the script I tried:
import csv
#open CSV's and read first column with product IDs into variables pointing to lists
A = [line.split(',')[0] for line in open('Afile.csv')]
B = [line.split(',')[0] for line in open('Bfile.csv')]
#create variables pointing to lists with unique product IDs in A and B respectively
inAnotB = list(set(A)-set(B))
inBnotA = list(set(B)-set(A))
print inAnotB
print inBnotA
c = csv.writer(open("inAnotB.csv", "wb"))
c.writerow([inAnotB])
d = csv.writer(open("inBnotA.csv", "wb"))
d.writerow([inBnotA])
print "done!"
But it doesn't produce the required results.
It prints IDs in this format:
247158132\n
and writes nothing to the CSV files.
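For reference, here is a minimal sketch of what I am trying to achieve, written for Python 3 (an assumption; the script above uses Python 2 print statements). The helper names `read_ids`, `write_ids` and `compare` are made up for illustration. It reads the first column with `csv.reader`, which drops the line terminator and so avoids the trailing `\n` in the IDs, and it writes one ID per row with `writerows` instead of passing the whole list as a single cell to `writerow`. The files are opened in `with` blocks so they are closed and flushed, which is probably why my version leaves the output files empty.

```python
import csv


def read_ids(path):
    """Return the set of values found in the first column of a CSV file."""
    with open(path, newline="") as f:
        # csv.reader strips the line terminator, so IDs carry no trailing "\n".
        # Blank lines come back as empty rows and are skipped.
        return {row[0] for row in csv.reader(f) if row}


def write_ids(path, ids):
    """Write one ID per row (sorted as strings) into the first column of a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # writerows expects an iterable of rows; each row is a one-element list.
        writer.writerows([product_id] for product_id in sorted(ids))


def compare(a_path, b_path, a_only_path, b_only_path):
    """Write the IDs unique to each file and return them as two sets."""
    a_ids = read_ids(a_path)
    b_ids = read_ids(b_path)
    only_a = a_ids - b_ids  # in A but not in B
    only_b = b_ids - a_ids  # in B but not in A
    write_ids(a_only_path, only_a)
    write_ids(b_only_path, only_b)
    return only_a, only_b
```

With my file names this would be called as `compare("Afile.csv", "Bfile.csv", "inAnotB.csv", "inBnotA.csv")`. Note that a header row is treated like any other ID here: if both files share the same header it cancels out in the set difference, otherwise it shows up in the output.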