Hike News


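Unlike the "grabbism" (拿来主义) that Lu Xun discussed eighty years ago, times have changed, and we have long since grown used to this way of "taking". In that sense, Lu Xun ought to be pleased. Product designs, film concepts, even the color of car tail lights have all been taken, one by one. Some are quietly slipped into our own products; the more thick-skinned even exploit the audience's ignorance and bill themselves as "national industry" or "independent innovation". Over time we even convince ourselves that a given inspiration really was conceived on our own, the product of long accumulation. To be fair, China's development over the past forty to fifty years has indeed been remarkable. At the same time, the demands of rapid development taught us to use "grabbism", and even to become skilled at it. While chasing the front-runners, this approach can close the gap with competitors in a short time, but at a deeper level it damages our own creativity. The more easily something can be taken, the more easily the taker loses the drive to improve. In the long run it becomes like a drug habit: sinking into self-satisfaction and no longer moving forward....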


Python Programs With External Modules to Spark

PySpark programs commonly need external modules. There are three ways to make them available:

  1. Install the third-party modules on every node of your Spark cluster. This is the easiest way, but it requires administrative rights on the cluster;
  2. Put your own functions in a single module and add it to the search path of the SparkContext. PySpark provides two utility functions for this: sc.addFile and sc.addPyFile.
  3. Package a module that spans multiple Python files into a single .zip or .egg file. Refer to these answers elsewhere:
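
As a rough illustration of options 2 and 3 (not one of the answers referenced above), the sketch below zips a small package and imports it from the archive. The package name `demo_sparkdeps`, its `shout` function, and the helper `zip_package` are hypothetical names invented for this example. The Spark calls appear only in comments, because they assume a live SparkContext named `sc` that this snippet does not create.

```python
import os
import sys
import tempfile
import zipfile


def zip_package(package_dir, zip_path):
    """Zip the .py files of a package so the archive root holds the package folder."""
    parent = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Store paths relative to the parent, e.g. demo_sparkdeps/textutils.py
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path


# Build a tiny hypothetical package in a temporary directory.
workdir = tempfile.mkdtemp()
pkg_dir = os.path.join(workdir, "demo_sparkdeps")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "textutils.py"), "w") as f:
    f.write("def shout(s):\n    return s.upper() + '!'\n")

archive = zip_package(pkg_dir, os.path.join(workdir, "demo_sparkdeps.zip"))

# Local check: Python can import packages directly from a zip on sys.path.
sys.path.insert(0, archive)
from demo_sparkdeps.textutils import shout

# With a live SparkContext `sc` (assumed, not created here), the same archive
# could be shipped to the executors before the package is used in a task:
#   sc.addPyFile(archive)
#   rdd.map(lambda s: shout(s)).collect()
```

For a single-file module (option 2), the path of the `.py` file itself can be passed to `sc.addPyFile` in the same way, without zipping.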


Useful IDEA Shortcuts on Mac OS X

IntelliJ is my trusted partner for programming and project review. I put together this list for daily reference and to share with the community. Golden shortcut: CMD + Shift + A. FIND: Find action by name: CMD + Shift + A; Find class or file by name: CMD + Shift + O; Show the list of available intention actions: ALT + Enter; Find text in the project or in the...


R Packages You Should Put Under Your Pillow

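R offers rich functionality for statistical analysis, machine learning, and plotting. The functions in the base installation cover basic needs; for more varied or complex data processing, try the following tools. Data cleaning and transformation (data wrangling): DescTools (tools for describing data and descriptive statistics); dplyr (built around data.frame, the next iteration of plyr, gives R a pipeline style of data processing); plyr (provides the useful ddply function, see http://www.r-bloggers.com/a-fast-intro-to-plyr-for-r/); reshape (basic data reshaping operations, rich but low-level); reshape2 (powerful melt and cast functions, a streamlined version of reshape...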


Top Tricks to Make Your Zotero More Powerful

Zotero is a free, easy-to-use tool that helps you collect, organize, cite, and share your research sources. It supports a wide range of document types, from papers to presentations and from web pages to notes and drafts. A SQLite database and online storage (with limited space) are used to store and index the citation information of each resour...


© 2017 InnoTrek All Rights Reserved.