|
My boss just finished the BA program at Queen's. You can choose to take the courses in downtown Toronto; it is not necessary to go to Kingston. The Queen's courses include some Big Data content and introduce tools such as Apache Spark, Hadoop, and Scala, but only at the level of basic concepts, which is still very far from what you need to be a data scientist.
If you want to do data science, then from the perspective of programming languages Python would be the first choice. SAS is very user friendly (although I think its macro logic does not make a lot of sense), but it is more of an analysis tool. SQL is for querying and managing databases, and R is more statistics oriented. The power of Python is that it lets you program on big data platforms, which the other languages cannot easily do. With Python, you can build what is called a data pipeline to do one-stop analysis: extracting live data, cleaning it, building aggregate tables, and producing the data visualization in your reporting dashboard. I once saw the charts made by the Under Armour data science team using Python. They were very fancy, way better than anything from a point-and-click visualization tool like Tableau.
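To make the pipeline idea concrete, here is a minimal sketch using only the Python standard library. The sample data, the column names (`region`, `visits`), and all function names are hypothetical, made up for illustration; a real pipeline would pull from a live source and feed a charting library or dashboard rather than print text.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data standing in for a live extract; the column names
# (region, visits) are assumptions for illustration only.
RAW_CSV = """region,visits
East,12
West,7
East,
West,5
 North ,9
"""

def extract(text):
    """Read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Trim whitespace and drop rows whose visit count is missing or non-numeric."""
    cleaned = []
    for row in rows:
        region = (row.get("region") or "").strip()
        visits = (row.get("visits") or "").strip()
        if region and visits.isdigit():
            cleaned.append({"region": region, "visits": int(visits)})
    return cleaned

def aggregate(rows):
    """Sum visits per region to form a small aggregate table."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["visits"]
    return dict(sorted(totals.items()))

def report(table):
    """Render the aggregate table as plain text (a stand-in for a dashboard chart)."""
    return "\n".join(f"{region:<6}{total:>4}" for region, total in table.items())

def run_pipeline(text):
    """Chain the extract, clean, and aggregate steps."""
    return aggregate(clean(extract(text)))

if __name__ == "__main__":
    print(report(run_pipeline(RAW_CSV)))
```

Each stage is a plain function, so any one of them can be swapped out (for example, reading from a database instead of CSV text) without touching the others.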
I can give a general picture of how the analytics/BI team is structured in my company. I work in operational analytics, where we have a team of about 60 people covering field service KPI reporting and ad-hoc analysis. All of our analysts have the title Business Intelligence. First, we have the ETL team, which extracts, transforms, and loads data from different databases and generates aggregate tables. Most of them have a technical background, and their job is closer to data science. Unfortunately, although the outside world has changed so much, our ETL is still based on SAS and SQL Server. We are always tortured by the slow speed of the server when the dataset is too big. I once saw how fast a big data tool like Spark is: they ran 10-gigabyte tables with Spark and it took only 20 minutes, which is unimaginable on our server.
Then we have the analysis team, which does all the KPI reporting and ad-hoc analysis. They use SAS, but not that much. They rely a lot on Excel, VBA, PPT, the cubes developed by the ETL team, and visualization tools (we use Tableau in our company). My team sits between the ETL team and the analysts. We do not work much with the server or extract data ourselves, but we use the raw tables to generate new aggregate tables. So the major part of our job is programming in SAS and doing data mining.
Honestly speaking, data science is a really cool job, but analyst roles are not that fancy. They are pretty much like any other analyst job, except that you need different tools (maybe some programming) because the data you analyze is larger than before. Another thing I want to mention is that reporting analyst is not a well-paid job. I know quite a few people in that role, and the average pay is only around 23-25 per hour. BI is not bad, and of course data scientist pays the best. |
|