Looping a function through Pandas dataframe with iterrows
My goal here is to transform the data in one dataframe and output the results to a new dataframe. Here's what I have so far, using a simplified dataframe:
import math
import pandas as pd

data = {'A':[1,4,3,5,7],'B':[0,6,3,0,2],'C':[1,1,3,0,4]} #sample data
df = pd.DataFrame(data)
transDF = pd.DataFrame() #empty dataframe for results

def Chord(y): #Chord transformation function
    ySUM = sum(a*a for a in y)
    ySUMsqrt = math.sqrt(ySUM)
    yPRIME = []
    for a in y:
        RESULT = a/ySUMsqrt
        yPRIME.append(RESULT)
    return yPRIME

for Yi, row in df.iterrows(): #my attempt at a loop
    yRow = df.loc[df.index == Yi]
    y = yRow.values.tolist()
    tfRow = float(Chord(y))
    transDF = transDF.append(tfRow)
The function itself works if I just feed it a list, but when I run the loop I get an error that says "can't multiply sequence by non-int of type 'list'". I've tried modifying my loop in as many ways as I can think of, but at this point I'm out of ideas. I would greatly appreciate any insight!
2 answers

IIUC, I don't think you need iterrows for this problem.
import math
import pandas as pd

data = {'A':[1,4,3,5,7],'B':[0,6,3,0,2],'C':[1,1,3,0,4]} #sample data
df = pd.DataFrame(data)
transDF = pd.DataFrame() #empty dataframe for results

def Chord(y): #Chord transformation function
    ySUM = sum(a*a for a in y)
    ySUMsqrt = math.sqrt(ySUM)
    yPRIME = []
    for a in y:
        RESULT = a/ySUMsqrt
        yPRIME.append(RESULT)
    return yPRIME

transDF = df.apply(Chord)
print(transDF)
Output:
     A         B        C
0  0.1  0.000000  0.19245
1  0.4  0.857143  0.19245
2  0.3  0.428571  0.57735
3  0.5  0.000000  0.00000
4  0.7  0.285714  0.76980
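Note that `df.apply(Chord)` uses the default `axis=0`, so it normalises each column, which is what the output above shows. If the goal is to normalise each row, as the `iterrows` loop in the question suggests, a minimal sketch (assuming that reading of the question is right) passes `axis=1` and `result_type='expand'`. The original loop most likely failed because `df.loc[...].values.tolist()` yields a nested list such as `[[1, 0, 1]]`, so `a*a` tried to multiply a list by a list. The name `row_wise` is only illustrative.

```python
import math
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 3, 5, 7], 'B': [0, 6, 3, 0, 2], 'C': [1, 1, 3, 0, 4]})

def Chord(y):
    # Divide each value by the Euclidean norm of the whole sequence.
    norm = math.sqrt(sum(a * a for a in y))
    return [a / norm for a in y]

# axis=1 hands Chord one row at a time; result_type='expand' spreads each
# returned list across columns instead of producing a single column of lists.
row_wise = df.apply(Chord, axis=1, result_type='expand')
row_wise.columns = df.columns  # 'expand' labels the columns 0, 1, 2; restore A, B, C
```

After this, the squares of each row of `row_wise` should sum to 1.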

Your code is really inefficient. Looping over rows in pandas is almost always unnecessary, and looping over single elements should be even rarer.
Make use of NumPy's vectorisation!
import pandas as pd
import numpy as np

def chord_transform(row):
    # divide the row by its Euclidean norm (square root of the sum of squares)
    return row / np.sqrt(np.sum(row**2))

data = {'A':[1,4,3,5,7],'B':[0,6,3,0,2],'C':[1,1,3,0,4]} #sample data
df = pd.DataFrame(data)
df_chord = df.apply(chord_transform, axis=1)
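The row-wise chord transformation from the question can also be written without `apply` at all, which keeps the whole computation in vectorised pandas/NumPy operations. This is only a sketch under the assumption that every row has at least one non-zero value; an all-zero row would end up as NaN because of the 0/0 division. The name `row_norms` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 3, 5, 7], 'B': [0, 6, 3, 0, 2], 'C': [1, 1, 3, 0, 4]})

# Euclidean norm of each row, returned as a Series aligned on the row index.
row_norms = np.sqrt((df ** 2).sum(axis=1))

# axis=0 matches the Series against the row index, so every row is
# divided by its own norm.
df_chord = df.div(row_norms, axis=0)
```

On a frame this small the difference is negligible, but on large frames avoiding a Python-level call per row is usually noticeably faster.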