
Last Week's Technology Watch: The Power of Software

  • [software; competition; chess; Intel] The Power of Software #

    Deep Junior has in fact already won the championship four times. It originally ran on a 4-way AMD Opteron system at roughly 6 million nodes per second (MNPS); on a 2-way Intel Woodcrest it reaches 8.2 MNPS -- Woodcrest really is strong. Rebuilding it with the Intel Compiler bumped that to 8.4 MNPS, and after applying the Intel Compiler's profile-guided optimizations, this Woodcrest monster reached 9 MNPS, a genuine 50% improvement over the AMD system's baseline.
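
    A minimal sketch of the profile-guided optimization workflow described above, assuming the classic Intel C compiler's -prof-gen/-prof-use flags on Linux; the source file, workload, and branchy hot loop below are hypothetical stand-ins, not Deep Junior's actual code.

        /*
         * Assumed PGO workflow with the classic Intel compiler (icc):
         *   1. icc -O2 -prof-gen -prof-dir=./pgo search.c -o search   (instrumented build)
         *   2. ./search                                               (run a representative workload to collect a profile)
         *   3. icc -O2 -prof-use -prof-dir=./pgo search.c -o search   (rebuild using the collected profile)
         */
        #include <stdio.h>

        /* Stand-in for a branchy search inner loop whose hot paths PGO can lay out better. */
        static unsigned long search_nodes(unsigned long n) {
            unsigned long i, visited = 0;
            for (i = 0; i < n; i++) {
                if (i % 3 == 0)
                    visited += 2;   /* less frequently taken branch */
                else
                    visited += 1;   /* the hot path */
            }
            return visited;
        }

        int main(void) {
            printf("nodes visited: %lu\n", search_nodes(9000000UL));
            return 0;
        }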

  • [security] Whitepapers #

    These papers are the work of the Honeynet Project. They discuss the tools, tactics, and motives of intruder groups.

  • [computer science] Classic Texts In Computer Science #

    Classic papers in computer science.

  • [dojo; ajax] Dojo.Book in Chinese (Chapter 1) #

    With the Web 2.0 boom, major web vendors such as Google and Yahoo are rushing to release their own Ajax toolkits, each hoping to define the Ajax standard. Who will come out on top? Some favor YUI, some like GWT, and others dismiss them all, insisting that writing your own JavaScript is the one true way. Setting aside which is best, what I am offering here is the first chapter of the development book for the IBM-backed Dojo toolkit, translated by me.

  • [web2.0; database technology] Database War Stories #9 (finis): Brian Aker of MySQL Responds #

    I didn't hear that flat files don't scale. What I heard is that some very big sites are saying that traditional databases don't scale, and that the evolution isn't from flat files to SQL databases, but from flat files to sophisticated custom file systems. Brian acknowledges that SQL vendors haven't solved the problem, but doesn't seem to think that anyone else has either.

  • [web2.0; database technology] Database War Stories #8: Findory and Amazon #

    Our read-only databases are flat files -- Berkeley DB to be specific -- and are replicated out using our own replication management tools to our webservers. This strategy gives us extremely fast access from the local filesystem. We make thousands of random accesses to this read-only data on each page serve; Berkeley DB offers the performance necessary to be able to still serve our personalized pages rapidly under this load.
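
    As a hedged illustration of the access pattern described above (many random reads against a local, read-only Berkeley DB file), here is a minimal sketch using the Berkeley DB C API; the file name, key, and error handling are assumptions for the example, not Findory's actual code. Link with -ldb.

        #include <stdio.h>
        #include <string.h>
        #include <db.h>   /* Berkeley DB C API */

        int main(void) {
            DB *dbp;
            DBT key, data;
            int ret;

            /* Create a handle and open the locally replicated data file read-only. */
            if ((ret = db_create(&dbp, NULL, 0)) != 0) {
                fprintf(stderr, "db_create: %s\n", db_strerror(ret));
                return 1;
            }
            if ((ret = dbp->open(dbp, NULL, "articles.db", NULL,
                                 DB_BTREE, DB_RDONLY, 0)) != 0) {
                dbp->err(dbp, ret, "open articles.db");
                dbp->close(dbp, 0);
                return 1;
            }

            /* One of the many random lookups made while serving a page. */
            memset(&key, 0, sizeof(key));
            memset(&data, 0, sizeof(data));
            key.data = "article:12345";                       /* hypothetical key */
            key.size = (u_int32_t)strlen("article:12345");

            ret = dbp->get(dbp, NULL, &key, &data, 0);
            if (ret == 0)
                printf("found %u bytes\n", (unsigned)data.size);
            else if (ret == DB_NOTFOUND)
                printf("key not found\n");
            else
                dbp->err(dbp, ret, "DB->get");

            dbp->close(dbp, 0);
            return 0;
        }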

  • [web2.0; database technology] Database War Stories #7: Google File System and BigTable #

    Greg Linden of Findory wrote: 'I've been enjoying your series on O'Reilly Radar about database war stories at popular startups. I was thinking that it would be fantastic if you could get Jeff Dean or Adam Bosworth at Google to chat a little bit about their database issues. As you probably know, Jeff Dean was involved in designing BigTable and the Google File System. Adam Bosworth wrote a much discussed post about the need for better, large scale, distributed databases.'

  • [web2.0; database technology] Database War Stories #6: O'Reilly Research #

    In building our Research data mart, which includes data on book sales trends, job postings, blog postings, and other data sources, Roger Magoulas has had to deal with a lot of very messy textual data, transforming it into something with enough structure to put it into a database. In this entry, he describes some of the problems, solutions, and the skills that are needed for dealing with unstructured data.

  • [web2.0; database technology] Database War Stories #5: craigslist #

    Do not expect FullText indexing to work on a very large table. It's just not fast enough for what users expect on the web, and updating rows will make bad things happen. We want forward-facing queries to be measured in a few hundredths of a second.

  • [web2.0; database technology] Database War Stories #4: NASA World Wind #

    Patrick Hogan of NASA World Wind, an open source program that does many of the same things as Google Earth, uses both flat files and SQL databases in his application. Flat files are used for quick response on the client side, while on the server side, SQL databases store imagery (and, soon to come, vector files). However, he admits that 'using file stores, especially when a large number of files are present (millions) has proven to be fairly inconsistent across multiple OS and hardware platforms.'

  • [web2.0; database technology] Database War Stories #3: Flickr #

    I also asked for any information on the scale of data Flickr manages and its growth rates. Cal answered: total stored unique data: 935 GB; total stored duplicated data: ~3 TB.

  • [web2.0; database technology] Database War Stories #2: bloglines and memeorandum #

    Bloglines has several data stores, only a couple of which are managed by 'traditional' database tools (which in our case is Sleepycat). User information, including email address, password, and subscription data, is stored in one database. Feed information, including the name of the feed, description of the feed, and the various URLs associated with feed, are stored in another database. The vast majority of data within Bloglines, however, the 1.4 billion blog posts we've archived since we went on-line, are stored in a data storage system that we wrote ourselves. This system is based on flat files that are replicated across multiple machines, somewhat like the system outlined in the Google File System paper, but much more specific to just our application. To round things out, we make extensive use of memcached to try to keep as much data in memory as possible to keep performance as snappy as possible.
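
    The memcached layer mentioned at the end suggests a classic cache-aside read path. Below is a minimal sketch using libmemcached in C; the key name, TTL, and the stand-in "flat-file store" lookup are assumptions for illustration, not Bloglines' actual code. Link with -lmemcached.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <stdint.h>
        #include <libmemcached/memcached.h>

        /* Hypothetical slow path: fetch a post from the backing flat-file store. */
        static const char *load_post_from_store(const char *post_key) {
            (void)post_key;
            return "{\"title\": \"example post\"}";   /* stand-in payload */
        }

        int main(void) {
            memcached_st *memc = memcached_create(NULL);
            memcached_return_t rc = memcached_server_add(memc, "127.0.0.1", 11211);
            if (rc != MEMCACHED_SUCCESS) {
                fprintf(stderr, "server_add: %s\n", memcached_strerror(memc, rc));
                memcached_free(memc);
                return 1;
            }

            const char *key = "post:12345";   /* hypothetical key */
            size_t value_len = 0;
            uint32_t flags = 0;

            /* Cache-aside read: try memcached first, fall back to the store and repopulate. */
            char *cached = memcached_get(memc, key, strlen(key), &value_len, &flags, &rc);
            if (cached != NULL) {
                printf("cache hit: %.*s\n", (int)value_len, cached);
                free(cached);   /* memcached_get returns malloc'd memory */
            } else {
                const char *post = load_post_from_store(key);
                printf("cache miss, loaded from store: %s\n", post);
                rc = memcached_set(memc, key, strlen(key), post, strlen(post),
                                   (time_t)300 /* 5-minute TTL */, 0);
                if (rc != MEMCACHED_SUCCESS)
                    fprintf(stderr, "set: %s\n", memcached_strerror(memc, rc));
            }

            memcached_free(memc);
            return 0;
        }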

  • [web2.0; database technology] Web 2.0 and Databases Part 1: Second Life #

    In this first installment, a few thoughts from Cory Ondrejka and Ian Wilkes of Linden Labs, creators of Second Life.

  • [mathematics; algorithms; people] The Beauty of Mathematics, Part 8 -- Jelinek's Story and Modern Language Processing #

    Another great contribution of Jelinek, together with Bahl, Cocke, and Raviv, is the BCJR algorithm, one of the two most widely used algorithms in digital communications today (the other is the Viterbi algorithm). Interestingly, the algorithm only came into wide use twenty years after it was invented. IBM therefore listed it as one of IBM's greatest contributions ever and posted it on the wall of its Almaden lab in California. Regrettably, all four of the BCJR authors have since left IBM; once, when IBM's communications division needed to use the algorithm, it had to invite an expert from Stanford University to come and explain it, and when that expert saw the achievement board in IBM's display case, he was deeply moved.

For more technology news, please visit my 365Key (RSS); you can subscribe through 365Key.