Building the foundations~

珍妮王小钰 · posted on 2019-2-6 03:16:18

78 statistics
85 probability

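87 Python:
SAS has an online tutorial that makes self-study efficient, and SQLZOO is good for practicing SQL: https://www.1point3acres.com/bbs/thread-138105-1-1.html
The Python that DA roles require is not quite the same as what CS roles require. You need to be fluent in these packages: pandas, NumPy, Matplotlib, and scikit-learn.
I am taking the 6001 online course on edX. It covers algorithms, uses Python, and is aimed at people with no background in Python or CS. It has lots of exercises and assignments and feels very well designed. After a month of lectures I feel I have improved a lot.
https://zhuanlan.zhihu.com/p/24546514

CS101 on Udacity. Everything in it is taught in Python, and every topic is followed immediately by Python coding exercises. I found it very good. If you already have some background, take a look at Learn Python the Hard Way; I think its code examples are very good too!

R learners should look at the exercises and tutorials on DataCamp.
For Python learners, the advice is to work through Learn Python the Hard Way while watching videos and doing exercises. Beginners should also take a look at Rodeo (described as a "Python IDE for data science").
For getting started with Python, I recommend MIT's 6001x on edX; it is good for building the foundations.

edX MIT 6001x and 6002x:
https://www.edx.org/course/introduction-to-computer-science-and-programming-using-python-0
https://www.edx.org/course/introduction-to-computational-thinking-and-data-science-2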

珍妮王小钰 · posted on 2019-2-7 07:17:52

ML: Multinomial distribution classification
• What is the mood of a person in the current minute? M = {Happy, Sad}
• Measure his/her actions every ten seconds: A = {Cry, Jump, Laugh, Yell}
Data (D): {LLJLCY, JJLYJL, CCLLLJ, JJJJJJ}
Bias (??): Probability table

LLJLCY: happy? .5 x .5 x .3 x .5 x .1 =
sad?:
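The per-mood product sketched above can be written out in a few lines of Python. The notes do not include the actual probability table, so every number below is made up for illustration, as are the equal priors and the names `likelihood`, `score`, and `classify`. Treat it as a rough sketch of the idea, not the table from the course.

```python
from math import prod

# Hypothetical P(action | mood) table; each row sums to 1.
# C = Cry, J = Jump, L = Laugh, Y = Yell.
likelihood = {
    "Happy": {"C": 0.05, "J": 0.30, "L": 0.50, "Y": 0.15},
    "Sad":   {"C": 0.50, "J": 0.10, "L": 0.20, "Y": 0.20},
}
# Assumed equal priors for the two moods.
prior = {"Happy": 0.5, "Sad": 0.5}

def score(actions, mood):
    """Prior times the product of the per-action probabilities for one mood."""
    return prior[mood] * prod(likelihood[mood][a] for a in actions)

def classify(actions):
    """Return the mood with the larger score for one minute of actions."""
    return max(prior, key=lambda mood: score(actions, mood))

for minute in ["LLJLCY", "JJLYJL", "CCLLLJ", "JJJJJJ"]:
    print(minute, classify(minute))
```

With this made-up table, a minute dominated by crying comes out as Sad and the others as Happy; a different table would of course give different answers.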

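10 points:
Introduction to the beta distribution

Anyone who has studied statistics is probably familiar with the normal, binomial, and uniform distributions, but the beta distribution is rarely introduced.

In one sentence: the beta distribution can be seen as a probability distribution over probabilities. When you do not know the exact probability of something, it tells you how likely each possible value of that probability is.

A simple explanation: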

https://www.cnblogs.com/shixisheng/p/7197623.html?utm_source=itdadao&utm_medium=referral
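To make "a distribution over probabilities" a bit more concrete, here is a small sketch using only the standard library. The prior Beta(2, 2) and the counts of 7 successes and 3 failures are hypothetical, and `beta_pdf` and `beta_mean` are names chosen for this example. It relies on the standard result that a Beta(a, b) prior combined with binomial observations gives a Beta(a + successes, b + failures) posterior.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at a probability p with 0 < p < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * p ** (a - 1) * (1 - p) ** (b - 1)

def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)

# Hypothetical prior belief about an unknown success probability.
a, b = 2, 2
# Hypothetical observations.
successes, failures = 7, 3

# Conjugate update: add the counts to the prior parameters.
post_a, post_b = a + successes, b + failures

print("prior mean:", beta_mean(a, b))
print("posterior mean:", beta_mean(post_a, post_b))
for p in (0.3, 0.5, 0.7):
    print(p, round(beta_pdf(p, post_a, post_b), 3))
```

The density values show which candidate probabilities the data make more or less plausible, which is the sense in which the beta distribution describes "the probability of a probability".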

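MAP / MLE:

Difference between the maximum likelihood hypothesis and Bayesian estimation:
Maximum likelihood estimation considers only the probability that a model produces the given observation sequence; it does not consider the probability of the model itself. This is where it differs from Bayesian estimation.
Difference between MAP and ML:
Maximum a posteriori (MAP) estimation uses empirical data to obtain a point estimate of a quantity that is hard to observe. It is similar to maximum likelihood estimation; the biggest difference is that MAP folds in the prior distribution of the quantity being estimated. MAP can therefore be viewed as regularized maximum likelihood estimation.
When does MAP equal ML?
When nothing is known about the prior probabilities of the hypotheses, so that all hypotheses h_i have the same probability, MAP is maximum likelihood (h_ML, the maximum likelihood hypothesis). When the amount of data is large enough, the MAP and maximum likelihood estimates also tend to agree.
How is MAP related to naive Bayes?
If the attribute-independence condition is satisfied, then vMAP = vNB.

Describe the difference between maximum likelihood estimation (MLE) and maximum a posteriori estimation (MAP). Explain why MLE overfits more easily than MAP.
MLE: take the parameter value that maximizes the likelihood function as the estimate, y_mle = argmax_y p(x|y). MAP: take the parameter value that maximizes the posterior (proportional to likelihood times prior) as the estimate, y_map = argmax_y p(x|y)p(y). Because MLE considers only how well the training data are fitted and ignores prior knowledge, it also fits erroneous points, which leads to overfitting.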
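A coin-flip example can illustrate the three claims above: MLE overfits small samples, MAP with a flat prior equals MLE, and the two agree as data accumulate. This is a sketch under assumptions: the coin, the counts, and the Beta(a, b) prior are hypothetical, and `mle_estimate` and `map_estimate` are names invented here. The MAP formula is the mode of the Beta(a + heads, b + tails) posterior.

```python
def mle_estimate(heads, tails):
    """Value of P(heads) that maximizes the likelihood of the observed flips."""
    return heads / (heads + tails)

def map_estimate(heads, tails, a, b):
    """Mode of the Beta(a + heads, b + tails) posterior.

    Assumes a Beta(a, b) prior with a + heads > 1 and b + tails > 1.
    """
    return (heads + a - 1) / (heads + tails + a + b - 2)

# Three flips, all heads: MLE declares the coin always lands heads.
print(mle_estimate(3, 0))            # 1.0
# A mild prior centred on 0.5 pulls the estimate back.
print(map_estimate(3, 0, 2, 2))      # 0.8
# A flat prior Beta(1, 1) makes MAP identical to MLE.
print(map_estimate(3, 0, 1, 1))      # 1.0
# With plenty of data the prior barely matters.
print(mle_estimate(750, 250), round(map_estimate(750, 250, 2, 2), 4))
```

The prior plays the same role as a regularization term: it matters most when there are few observations.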

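珍妮王小钰 · posted on 2019-2-7 08:29:40

Even people who have studied machine learning may only half understand MLE (maximum likelihood estimation), MAP (maximum a posteriori estimation), and Bayesian estimation. A basic model can usually be approached from all three angles. Take logistic regression, for example:

MLE: Logistic Regression

MAP: Regularized Logistic Regression

Bayesian: Bayesian Logistic Regression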

-----------------------------

multiple multivariate probabilities:

# acts=4 tunes=5 weather=7

how many??:

# prob entries = 4 x 5 x 7 = 140

# params = 2 (classes) x 139 (likelihood entries per class) + (2-1) (prior) = 279

features: acts, tunes, weather

# params = classes x (Π_j (values of feature j) - 1) + (classes - 1)

params to estimate the likelihood with naive Bayes: 2x(4-1) + 2x(5-1) + 2x(7-1) = 26

# each term: number of classes x (values of acts, tunes, weather, each minus 1)

benefit of naive Bayes: very fast learning and classifying.
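The two parameter counts can be checked with a few lines of arithmetic. The variable names are chosen for this sketch; the numbers (2 classes, 4 acts, 5 tunes, 7 weather values) come from the notes above. `math.prod` needs Python 3.8 or later.

```python
from math import prod

n_classes = 2
feature_values = {"acts": 4, "tunes": 5, "weather": 7}

# Full joint model: one probability per combination of feature values.
joint_entries = prod(feature_values.values())
full_params = n_classes * (joint_entries - 1) + (n_classes - 1)

# Naive Bayes: each feature is modelled separately given the class.
naive_likelihood_params = n_classes * sum(v - 1 for v in feature_values.values())
naive_total_params = naive_likelihood_params + (n_classes - 1)

print(joint_entries)             # 140
print(full_params)               # 279
print(naive_likelihood_params)   # 26
print(naive_total_params)        # 27
```

The drop from 279 to 27 parameters is what makes naive Bayes fast to learn, at the cost of assuming the features are independent given the class.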

珍妮王小钰 · posted on 2019-2-11 13:14:41

87: data types~~

Run and move to the next cell: Shift+Enter
Run and stay on the current cell: Ctrl+Enter

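type(1.2-1)
out: float (the other common numeric type is int)
7%2
out: 1
10%2
out: 0
type("q")
out: str
Strings can be written as 'a' or "hello 'qin' ...". If the text contains both ' and ", wrap it in triple quotes """ ... """; a triple-quoted string may also span several lines.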
print("\"\"b") # a backslash escapes the quote that follows it
out: ""b

int("1")+1
2

bool:
type(True)
out: bool
1<2+1
out: True
(1<2)+1
out: 2 (True counts as 1 in arithmetic)
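The type rules above can be checked in one runnable snippet. The variable names are just for this sketch.

```python
# Mixing a float and an int gives a float.
difference = 1.2 - 1
print(type(difference).__name__)   # float

# % is the remainder operator.
print(7 % 2, 10 % 2)               # 1 0

# int() parses a numeric string so it can be used in arithmetic.
parsed = int("1") + 1
print(parsed)                      # 2

# + binds more tightly than <, so this reads as 1 < (2 + 1).
print(1 < 2 + 1)                   # True

# bool is a subclass of int, so True behaves as 1 in arithmetic.
with_parens = (1 < 2) + 1
print(with_parens)                 # 2

# Triple quotes allow both kinds of quote characters and line breaks.
text = """she said "hello 'qin'"
on two lines"""
print(text.count("\n"))            # 1
```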

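Variables:

tip: a,b=1,2 assigns both variables in one statement
In for loops:
a+=1
a=a+1 (the right-hand side of the equals sign is evaluated first; if a is 1, then 1+1=2 is assigned back to a)

Python's three main data structures.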

珍妮王小钰 · posted on 2019-2-12 00:33:55

Lists: add, delete, look up, modify (indexing with [], append)

num=[1,2,3]
num
out: [1,2,3]
sum(num)
out: 6
len(num)
out: 3 (the number of elements in the collection)

slice # look up
num[0]
out: 1
num[2] # same as num[len(num)-1]
out: 3
num[-1] # last element
num[-2] # second-to-last element (2)
num[0:1]
out: [1]
num[0:2]
out: [1,2]
num[0:3]
out: [1,2,3]

num[0:]
[1,2,3]
num[1:]
[2,3]

num[:2]
[1,2]

num[-2:]
[2,3]

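Summing individual elements: num[0]+num[1]

# Shift+Tab shows the signature: L.insert(index, object)
Add # arguments: the index to insert at, then the value to insert
num.insert(1,4)
num
out: [1,4,2,3]
num.insert(4,5) works, but num.append(6) is the better way to add at the end
.append() adds only one element at the end; to add several at once: num+[8,9] (this returns a new list and leaves num unchanged; num.extend([8,9]) changes num in place)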

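Delete: num.pop(1) # removes and returns the element at index 1
num
out:
num.pop() removes the last element by default

Modify:
num[3]=4 (replaces the element at that position); num[4]=5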
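The add, delete, and modify steps above, run in sequence from a fresh list. The names `combined`, `removed`, and `last` are just for this sketch.

```python
num = [1, 2, 3]

num.insert(1, 4)          # insert the value 4 at index 1
print(num)                # [1, 4, 2, 3]

num.append(5)             # append one element at the end
combined = num + [8, 9]   # concatenation builds a new list; num is unchanged
print(combined)           # [1, 4, 2, 3, 5, 8, 9]
print(num)                # [1, 4, 2, 3, 5]

removed = num.pop(1)      # remove and return the element at index 1
print(removed, num)       # 4 [1, 2, 3, 5]

last = num.pop()          # with no argument, pop removes the last element
print(last, num)          # 5 [1, 2, 3]

num[2] = 30               # assignment by index replaces an existing element
print(num)                # [1, 2, 30]
print(num[-1], num[:2])   # 30 [1, 2]
```

Note that assigning to an index that does not exist yet raises an IndexError, so new elements have to go in through append, insert, or extend.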

Two-dimensional:
b=[[1,2],[3,4]]
--------------
name=['','','']
sex=['','','']
age=[12,14,15]
# nesting:
info=[name,sex,age]
info
out: [['','',''],['','',''],[12,14,15]]
info[0][1] # the element at index 1 of name, i.e. its second entry

row=[0]*3
row
out: [0,0,0]
[row]*4
out: [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]
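One thing worth knowing about [row]*4: it repeats a reference to the same inner list rather than copying it, so changing one row changes them all. The sketch below contrasts it with a list comprehension that builds separate rows; the names `shared` and `independent` are made up for this example.

```python
row = [0] * 3
shared = [row] * 4                           # four references to one list
independent = [[0] * 3 for _ in range(4)]    # four separate lists

shared[0][0] = 1
independent[0][0] = 1

print(shared)        # [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(independent)   # [[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

For a grid whose cells will be edited individually, the comprehension form is usually the safer choice.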
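List methods

list.append(var) # add var at the end of the list
list.insert(index,var) # insert var at position index
list.extend(list1) # append all elements of list1

list.remove(var) # remove the first occurrence of var
list.pop() # remove and return the last element; list.pop(i) removes the element at index i

list.count(var) # count how many times var occurs

list.index(var) # return the index of the first match

list.sort() # sort in place; strings are ordered by character code (ASCII order), starting from the first character
list.reverse() # reverse in place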
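A short run through the remaining methods, using a hypothetical list called `letters`. The mixed case shows what sorting by character code means: uppercase letters come before lowercase ones.

```python
letters = ["b", "a", "C", "a"]

n_a = letters.count("a")        # how many times "a" occurs
first_a = letters.index("a")    # index of the first "a"
print(n_a, first_a)             # 2 1

letters.remove("a")             # removes only the first "a"
print(letters)                  # ['b', 'C', 'a']

letters.sort()                  # in place; 'C' sorts before 'a'
print(letters)                  # ['C', 'a', 'b']

letters.reverse()               # in place
print(letters)                  # ['b', 'a', 'C']

letters.extend(["d", "e"])      # append every element of another list
print(letters)                  # ['b', 'a', 'C', 'd', 'e']
```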

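珍妮王小钰 · posted on 2019-2-12 14:20:03

# set(): converting a list to a set removes duplicates
   a=[1,2,3,3,4]

   set(a)
   out:{1,2,3,4}
------------------
# a set cannot be indexed; convert it back to a list first
   list(set(a))[1]
# & is intersection, | is union
# (the outputs below match a=[1,2,3] and b=[2,3,4]; b is not defined in these notes)
   set(a)&set(b)
   out:{2,3}
   set(a)|set(b)
   out:{1,2,3,4}
# difference
set(a)-set(b)
out: {1}
set(b)-set(a)
out: {4}
[1,2]>a
out: False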
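The set operations can be run end to end as follows. The values a = [1, 2, 3] and b = [2, 3, 4] are an assumption chosen to be consistent with the outputs in the notes, and the results are wrapped in sorted() so the printed order does not depend on how a set happens to be displayed.

```python
dupes = [1, 2, 3, 3, 4]
unique = sorted(set(dupes))         # set() drops the repeated 3
print(unique)                       # [1, 2, 3, 4]

a = [1, 2, 3]                       # assumed values
b = [2, 3, 4]                       # assumed values

common = sorted(set(a) & set(b))    # intersection
either = sorted(set(a) | set(b))    # union
only_a = sorted(set(a) - set(b))    # in a but not in b
only_b = sorted(set(b) - set(a))    # in b but not in a
print(common, either, only_a, only_b)   # [2, 3] [1, 2, 3, 4] [1] [4]

# Lists compare element by element; a shorter prefix counts as smaller.
print([1, 2] > a)                   # False
```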
--------------------------------------
Dictionaries:
a={'name':'yangxi','age':18}
a.keys()
out:dict_keys(['name', 'age'])
But: indexing the result directly, as in a.keys()[0], raises an error
list(a.keys())
out:['name', 'age']
list(a.values())

Out[7]:['yangxi', 18]
------------------
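a.items()
Out[9]: dict_items([('name', 'yangxi'), ('age', 18)])
list(a.items())
out: [('name', 'yangxi'), ('age', 18)]
# tuples cannot be modified
list(a.items())[0][1]
out: 'yangxi'
list(a.items())[0]
out: ('name', 'yangxi')
a.get('nam',1) # get is a pure lookup; the second argument is returned when the key is missing
out: 1
a.get('name',1)
out: 'yangxi'
a.setdefault('sex','female') # looks the key up and, if it is missing, stores the default
out: 'female'
a
out: {'name': 'yangxi', 'age': 18, 'sex': 'female'}

Dictionary methods
d.get(key,0) # like d[key], but returns 0 instead of raising an error when the key is missing
d.has_key(key) # True if the key exists, otherwise False (Python 2 only; in Python 3 write key in d)
d.keys() # returns the dictionary's keys
d.values() # returns the dictionary's values (in insertion order since Python 3.7)
d.update(dict2) # merges dict2 into d
d.setdefault('stuid',1123)
d.pop(key)
d.popitem() # removes one (key, value) pair and returns it; raises an exception if the dictionary is empty
d.clear() # empties the dictionary (del d, by contrast, deletes the variable itself)
d.copy() # copies the dictionary
cmp(dict1,dict2) # compares dictionaries (by number of items, then keys, then values); returns 1 if the first is larger, -1 if smaller, 0 if equal (Python 2 only)
d.items() # the (key, value) pairs, each as a tuple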
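Here is how the main dictionary methods behave in Python 3, including the `in` test that replaces has_key. The dictionary `person` and the other variable names are made up for this sketch.

```python
person = {"name": "yangxi", "age": 18}

has_name = "name" in person                   # Python 3 replacement for has_key
missing = person.get("nam", 1)                # key is absent, so the default comes back
print(has_name, missing)                      # True 1

added = person.setdefault("sex", "female")    # stores the default because the key is new
print(added, person["sex"])                   # female female

person.update({"age": 19})                    # overwrite or add keys from another dict
removed = person.pop("age")                   # remove a key and return its value
print(removed, list(person.keys()))           # 19 ['name', 'sex']

snapshot = person.copy()                      # shallow copy, unaffected by clear()
person.clear()
print(person, snapshot)                       # {} {'name': 'yangxi', 'sex': 'female'}
```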

珍妮王小钰 · posted on 2019-2-12 15:02:01

a=10
if a >10:
  print('more than 10')
elif a==10:
  print('equal to 10')
else :
  print('less than 10')

珍妮王小钰 · posted on 2019-2-19 01:37:18

a=10
if a >10:
  print('more than 10')
elif a>8:
  print('more than 8')
elif a>6 :
  print('more than 6')
out: more than 8.
--------------------------------------
if a>10:
    print('more than 10')
    if a%2==0:
        print('ou')
    else:
        print('no')
else:
    print('less than 10')

count=0
while count<10:
    print('the num is:',count)
    count=count+1 # print first, then increment

--->out:
the num is: 0
the num is: 1
the num is: 2
the num is: 3
the num is: 4
the num is: 5
the num is: 6
the num is: 7
the num is: 8
the num is: 9

count=0
while count<10:
    count=count+1
    print('the num is:',count)
--->out:
the num is: 1
the num is: 2
the num is: 3
the num is: 4
the num is: 5
the num is: 6
the num is: 7
the num is: 8
the num is: 9
the num is: 10

The colon (:) opens a block; the indented lines under it belong to that block.

count=0
while count<10:
    count=count+1
    print('the num is:',count)
    if count==5:
        break
--->out:
the num is: 1
the num is: 2
the num is: 3
the num is: 4
the num is: 5

count=0
while count<10:
    count=count+1
    print('the num is:',count)
    if count==5:
        continue
--->out:
the num is: 1
the num is: 2
the num is: 3
the num is: 4
the num is: 5
the num is: 6
the num is: 7
the num is: 8
the num is: 9
the num is: 10

break ends the loop outright, whereas continue only skips the rest of the current iteration
count=0
while count<10:
    count=count+1
    if count==5:
        continue # at 5, the lines below are skipped
    print('the num is:',count)
--->out:
the num is: 1
the num is: 2
the num is: 3
the num is: 4
the num is: 6
the num is: 7
the num is: 8
the num is: 9
the num is: 10
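The difference between break and continue is easier to compare when both loops collect their results into a list. The helper `run` and its parameters `stop_at`, `skip`, and `limit` are hypothetical names for this sketch.

```python
def run(stop_at=None, skip=None, limit=10):
    """Count from 1 to limit, skipping `skip` and stopping after `stop_at`."""
    seen = []
    count = 0
    while count < limit:
        count += 1
        if count == skip:
            continue          # jump straight to the next iteration
        seen.append(count)
        if count == stop_at:
            break             # leave the loop entirely
    return seen

print(run(stop_at=5))   # [1, 2, 3, 4, 5]
print(run(skip=5))      # [1, 2, 3, 4, 6, 7, 8, 9, 10]
```

As in the notes, where continue sits matters: placed after the print (here, after the append) it would skip nothing visible.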

Inside a while loop: count=count+1 or count=count+2 steps by 1 or by 2
for i in range(1,10):
    print(i)
out (one number per line): 1 2 3 4 5 6 7 8 9

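珍妮王小钰 · posted on 2019-2-19 23:34:16

for i in range(10):
    print(i)
out (one number per line): 0 1 2 3 4 5 6 7 8 9

for i in range(1,10):
    print(i)
out (one number per line): 1 2 3 4 5 6 7 8 9

for i in range(5,10):
    print(i)
out (one number per line): 5 6 7 8 9

Shift+Tab shows the signature: range(start, stop[, step])

for i in range(5,20,2):
    print(i)

out (one number per line): 5 7 9 11 13 15 17 19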
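Wrapping range in list() is a quick way to see which numbers a given call produces without writing a loop. The negative-step case is an extra illustration not covered in the notes above.

```python
zero_to_nine = list(range(10))          # start defaults to 0; stop is excluded
one_to_nine = list(range(1, 10))
odd_steps = list(range(5, 20, 2))       # start, stop, step
countdown = list(range(10, 0, -3))      # a negative step counts down

print(zero_to_nine)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(one_to_nine)    # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(odd_steps)      # [5, 7, 9, 11, 13, 15, 17, 19]
print(countdown)      # [10, 7, 4, 1]
```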

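a=1
for i in range(10):
    a=a+i
    print(a)
out (one number per line): 1 2 4 7 11 16 22 29 37 46

Looping over a list!
a=['a','b','c']
for i in a:
    print(i)
out (one per line): a b c
a[0],a[1],a[2]
-->out: ('a', 'b', 'c')
Next step: adding a prefix
a=['a','b','c']
b='str:'
for i in a:
    print(b,i)

Looping over a dictionary!
dict={'a':1,'b':3,'c':'abc'}
dict.keys()
-->out: dict_keys(['a', 'b', 'c'])
dict.values()
-->out: dict_values([1, 3, 'abc'])
for k in dict.keys():
    print(k)
out (one per line): a b c
for k in dict.values():
    print(k)
out (one per line): 1 3 abc
for k in dict.items():
    print(k)
out (one per line): ('a', 1) ('b', 3) ('c', 'abc')
for k in dict.items():
    print(k[0]) # first element of each tuple
out (one per line): a b c
for k in dict.items():
    print(k[1]) # second element of each tuple
out (one per line): 1 3 abc
------------
dict.items()
out: dict_items([('a', 1), ('b', 3), ('c', 'abc')])

Improvement~~ a Python for loop can unpack several values at once.
for k,v in dict.items():
    print(k)
    print(v)
out (one per line): a 1 b 3 c abc
Further improvement~~
for k,v in dict.items():
    print(k,v)
out:
a 1
b 3
c abc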
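Unpacking in a for loop works with anything that yields pairs, not only dict.items(). The sketch below uses the name `table` instead of `dict`, because assigning to `dict` hides the built-in type of the same name; `pairs` and `positions` are likewise names chosen for this example.

```python
table = {"a": 1, "b": 3, "c": "abc"}

# Unpack each (key, value) tuple directly in the loop header.
pairs = []
for key, value in table.items():
    pairs.append(f"{key}={value}")
print(pairs)          # ['a=1', 'b=3', 'c=abc']

# enumerate() yields (position, item) pairs, so it unpacks the same way.
positions = []
for position, key in enumerate(table):
    positions.append((position, key))
print(positions)      # [(0, 'a'), (1, 'b'), (2, 'c')]
```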

珍妮王小钰 · posted on 2019-2-19 23:58:33

list=[]
for i in range(1,101):
    list.append(i)

Improved version!!
list=[i for i in range(1,101)]
list
--------
list=[]
for i in range(1,101):
    if i%2==0:
        list.append(i) # mind the indentation!!
list
------------
list=[i for i in range(1,101) if i%2==0]
list
out: [2, 4, 6, 8, ..., 100]
list=[i**2 for i in range(1,101) if i%2==0]
----------------------------------------------
list=['str'+str(i) for i in range(1,101) if i%2==0]
list
out: ['str2', 'str4', 'str6', 'str8', 'str10', 'str12', 'str14', 'str16', 'str18', ..., 'str100']
list=['str'+str(i) for i in range(1,101) if (i%2==0)&(i%3==0)]
out: ['str6', 'str12', 'str18', 'str24', 'str30', ...]
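The loop and comprehension forms can be checked against each other. This sketch uses descriptive names (`evens`, `squares`, `labels`) rather than `list`, since reusing `list` as a variable name hides the built-in, and it uses `and` in place of `&`, which is the more usual way to combine two conditions.

```python
# Explicit loop.
evens_loop = []
for i in range(1, 101):
    if i % 2 == 0:
        evens_loop.append(i)

# Equivalent comprehension: expression, loop, optional filter.
evens = [i for i in range(1, 101) if i % 2 == 0]
print(evens == evens_loop)   # True

# The expression part can transform each element.
squares = [i ** 2 for i in evens]
print(squares[:3])           # [4, 16, 36]

# Two conditions combined with `and`.
labels = ["str" + str(i) for i in range(1, 101) if i % 2 == 0 and i % 3 == 0]
print(labels[:5])            # ['str6', 'str12', 'str18', 'str24', 'str30']
print(len(labels))           # 16
```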
