"records = [json.loads(line) for line in open(path)]"
"The probability density function for lognorm is: lognorm.pdf(x, s) = 1 / (s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2) for x > 0, s > 0. lognorm takes s as a shape parameter. The probability density above is defined in the “standardized” form. To shift and/or scale the distribution use the loc and scal"
"def get_top_amounts(group, key, n=5): totals = group.groupby(key)['contb_receipt_amt'].sum() # Order totals by key in descending order return totals.order(ascending=False)[-n:]"
"return totals.order(ascending=False)[:n]"
"TypeError: pivot_table() got an unexpected keyword argument 'rows'"
"从0开始,步长1和-1出现的概率相等。通过内置的random模块以纯python的方式实现1000步的随机漫步: In [1]: import random In [2]: position=0 In [3]: walk=[position] In [4]: steps=1000 In [5]: for i in xrange(steps): ...: step=1 if random.randint(0,1) else -1 ...: position += step ...: walk.append(position) ...: 我用np.random模块一次性随机产生1000个“掷硬币的"
Supplemental textbook for Darcy's course "Data Programming with Python". The pandas and matplotlib libraries are discussed in detail, data management is covered at the beginning, and the illustrations are great. A useful reference book for the data field.