[Themed Activity] 【CASK EFFECT】0910F Comprehensive Reading Training -- Obstacle Training 【SCI】 1-3

草木也知愁

Posted on 2009-7-12 14:09:16

Last edited by 草木也知愁 on 2009-7-13 13:53

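【CASK EFFECT】0910G Reading Ability Baseline Self-Test (Speed, Difficulty, Depth, Obstacle Training, Past Exams, RAM)
https://bbs.gter.net/forum.php?mod=viewthread&tid=910464&highlight

【CASK EFFECT】0910G Comprehensive Reading Training -- Difficulty 【LSAT】 Index Thread
https://bbs.gter.net/thread-982016-1-1.html

【CASK EFFECT】0910G Comprehensive Reading Training -- Speed 【CET】 Index Thread
https://bbs.gter.net/thread-982018-1-1.html

【CASK EFFECT】0910F Comprehensive Reading Training -- Obstacle Training 【SCI】 Index Thread
https://bbs.gter.net/thread-982020-1-1.html

【CASK EFFECT】0910G Comprehensive Reading Training -- Past Exams 【GRE】 (coming later)

【CASK EFFECT】0910G Comprehensive Reading Training -- Depth 【FICTION】 (coming later)

【CASK EFFECT】0910F Comprehensive Reading Training -- RAM Index Thread (coming later)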

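Rules:

Every day I post one passage of about 1,000 words (excerpted from the books or papers I normally read).

There is no other requirement: just stick with it and read each passage to the end.

If you can keep this up for a month, you will find that your reading has evolved~

[Notes]
1. Read directly on the computer screen. The GRE reading section is taken on paper, but reading on screen keeps you from taking notes and puts a visual obstacle in your way, so difficulty training and anti-distraction training happen at the same time, which is more efficient. (It will be tiring at first, but since you want to become an expert, don't go easy on yourself.)
2. Don't obsess over speed; finishing the passage is enough.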

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs

ABSTRACT
The BLAST programs are widely used tools for
searching protein and DNA databases for sequence
similarities. For protein comparisons, a variety of
definitional, algorithmic and statistical refinements
described here permits the execution time of the
BLAST programs to be decreased substantially while
enhancing their sensitivity to weak similarities. A new
criterion for triggering the extension of word hits,
combined with a new heuristic for generating gapped
alignments, yields a gapped BLAST program that runs
at approximately three times the speed of the original.
In addition, a method is introduced for automatically
combining statistically significant alignments produced
by BLAST into a position-specific score matrix,
and searching the database using this matrix. The
resulting Position-Specific Iterated BLAST (PSI-BLAST)
program runs at approximately the same
speed per iteration as gapped BLAST, but in many
cases is much more sensitive to weak but biologically
relevant sequence similarities. PSI-BLAST is used to
uncover several new and interesting members of the
BRCT superfamily.

INTRODUCTION
Variations of the BLAST algorithm (1) have been incorporated
into several popular programs for searching protein and DNA
databases for sequence similarities. BLAST programs have been
written to compare protein or DNA queries with protein or DNA
databases in any combination, with DNA sequences often
undergoing conceptual translation before any comparison is
performed. We will use the blastp program, which compares
protein queries to protein databases, as a prototype for BLAST,
although the ideas presented extend immediately to other
versions that involve the translation of a DNA query or database.
Some of the refinements described are applicable as well to
DNA–DNA comparison, but have yet to be implemented.
BLAST is a heuristic that attempts to optimize a specific
similarity measure. It permits a tradeoff between speed and
sensitivity, with the setting of a ‘threshold’ parameter, T. A higher
value of T yields greater speed, but also an increased probability
of missing weak similarities. The BLAST program requires time
proportional to the product of the lengths of the query sequence
and the database searched. Since the rate of change in database
sizes currently exceeds that of processor speeds, computers
running BLAST are subjected to increasing load. However, the
conjunction of several new algorithmic ideas allows a new version
of BLAST to achieve improved sensitivity at substantially
augmented speed. This paper describes three major refinements
to BLAST.
(i) For increased speed, the criterion for extending word pairs
has been modified. The original BLAST program seeks short
word pairs whose aligned score is at least T. Each such ‘hit’ is then
extended, to test whether it is contained within a high-scoring
alignment. For the default T value, this extension step consumes
most of the processing time. The new ‘two-hit’ method requires
the existence of two non-overlapping word pairs on the same
diagonal, and within a distance A of one another, before an
extension is invoked. To achieve comparable sensitivity, the
threshold parameter T must be lowered, yielding more hits than
previously. However, because only a small fraction of these hits
are extended, the average amount of computation required
decreases.
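
The two-hit criterion in (i) can be sketched roughly as follows. This is only an illustrative sketch, not the BLAST source: the function name `two_hit_triggers`, the input format (a list of word-hit start positions), and the default values `word_len=3` and `window=40` are assumptions made for the example. A hit triggers an extension only if an earlier, non-overlapping hit lies on the same diagonal within the window.

```python
def two_hit_triggers(hits, word_len=3, window=40):
    """Return the hits that would trigger an extension under a two-hit rule.

    hits: iterable of (query_pos, db_pos) start positions of word hits.
    word_len and window (the distance A) are illustrative defaults.
    """
    last_on_diag = {}  # diagonal -> query_pos of the most recent recorded hit
    triggers = []
    # Scan hits in database order, as a database scan would encounter them.
    for q, d in sorted(hits, key=lambda h: h[1]):
        diag = d - q  # hits on the same diagonal share db_pos - query_pos
        prev = last_on_diag.get(diag)
        if prev is not None:
            dist = q - prev
            if dist < word_len:
                # Overlaps the recorded hit: ignore it and keep the old one.
                continue
            if dist <= window:
                triggers.append((q, d))
        last_on_diag[diag] = q
    return triggers


hits = [(0, 5), (1, 6), (10, 15), (100, 105), (3, 50)]
print(two_hit_triggers(hits))
```

In this toy input, only (10, 15) has a non-overlapping partner on its diagonal within the window, so it is the only hit that would be extended; the other four are recorded but cost no extension work.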
(ii) The ability to generate gapped alignments has been added.
The original BLAST program often finds several alignments
involving a single database sequence which, when considered
together, are statistically significant. Overlooking any one of
these alignments can compromise the combined result. By
introducing an algorithm for generating gapped alignments, it
becomes necessary to find only one rather than all the ungapped
alignments subsumed in a significant result. This allows the T
parameter to be raised, increasing the speed of the initial database
scan. The new gapped alignment algorithm uses dynamic
programming to extend a central pair of aligned residues in both
directions. For speed, earlier heuristic methods (2,3) confined the
alignments produced to a predefined strip of the dynamic
programming path graph. Our approach considers only
alignments that drop in score no more than Xg below the best
score yet seen. The algorithm is able thereby to adapt the region
of the path graph it explores to the data.
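
To make the score-drop idea in (ii) concrete, here is a simplified sketch of a one-directional gapped extension. It is not the algorithm from the paper: the scoring scheme (`match`, `mismatch`, a linear `gap` penalty) and the function name `xdrop_extend` are assumptions for illustration, and for clarity the loop still visits every column instead of tracking the live band. A cell is kept only if its score is within `x_drop` (playing the role of Xg) of the best score seen so far.

```python
NEG = float("-inf")


def xdrop_extend(a, b, x_drop, match=2, mismatch=-1, gap=-2):
    """Extend a gapped alignment from the start of a and b.

    Returns (best_score, explored_cells). Cells scoring more than x_drop
    below the best score seen so far are pruned (treated as unreachable).
    """
    best = 0
    # Row 0: empty prefix of a against prefixes of b (leading gaps only).
    prev = [0] + [NEG] * len(b)
    for j in range(1, len(b) + 1):
        s = prev[j - 1] + gap
        if s < best - x_drop:
            break
        prev[j] = s
    cells = sum(1 for v in prev if v != NEG)

    for i in range(1, len(a) + 1):
        cur = [NEG] * (len(b) + 1)
        alive = False
        for j in range(len(b) + 1):
            cands = [prev[j] + gap]  # a[i-1] aligned to a gap
            if j > 0:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                cands.append(prev[j - 1] + sub)  # aligned residue pair
                cands.append(cur[j - 1] + gap)   # b[j-1] aligned to a gap
            s = max(cands)
            if s == NEG or s < best - x_drop:
                continue  # pruned: too far below the best score
            cur[j] = s
            alive = True
            cells += 1
            best = max(best, s)
        if not alive:
            break  # every cell in this row was pruned; stop extending
        prev = cur
    return best, cells


print(xdrop_extend("ACGTTACGT", "ACGTACGT", x_drop=5))
print(xdrop_extend("ACGTTACGT", "ACGTACGT", x_drop=1))
```

With the larger `x_drop` the extension can pay for the single gap and continue matching; with the smaller one it gives up after the first four matches but explores far fewer cells, which is the speed and sensitivity tradeoff the parameter controls.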
(iii) BLAST searches may be iterated, with a position-specific
score matrix generated from significant alignments found in
round i used for round i + 1. Motif or profile search methods
frequently are much more sensitive than pairwise comparison
methods at detecting distant relationships. However, creating a
set of motifs or a profile that describes a protein family, and
searching a database with them, typically has involved running
several different programs, with substantial user intervention at
various stages. The BLAST algorithm is easily generalized to use
an arbitrary position-specific score matrix in place of a query
sequence and associated substitution matrix. Accordingly, we
have automated the procedure of generating such a matrix from
the output produced by a BLAST search, and adapted the BLAST
algorithm to take this matrix as input. The resulting Position-
Specific Iterated BLAST, or PSI-BLAST, program may not be as
sensitive as the best available motif search programs, but its speed
and ease of operation can bring the power of these methods into
more common use.
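
As a rough illustration of the position-specific score matrix in (iii), the sketch below builds log-odds column scores from a small set of aligned sequences and scores a segment against them. Everything here is a simplifying assumption for the example: the names `build_pssm` and `score_window`, the uniform background frequencies, and the flat pseudocount. The excerpt does not specify how the matrix is actually computed.

```python
import math

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_pssm(aligned, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences.

    Assumes a uniform background and a flat pseudocount (illustrative only).
    Characters outside AMINO, such as '-' for gaps, are skipped.
    """
    background = 1.0 / len(AMINO)
    pssm = []
    for col in range(len(aligned[0])):
        counts = {aa: pseudocount for aa in AMINO}
        for seq in aligned:
            if seq[col] in counts:
                counts[seq[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2(counts[aa] / total / background) for aa in AMINO}
        )
    return pssm


def score_window(pssm, segment):
    """Sum the column scores for a segment as long as the matrix."""
    return sum(col[aa] for col, aa in zip(pssm, segment))


pssm = build_pssm(["ACD", "ACE", "ACD"])
print(score_window(pssm, "ACD"), score_window(pssm, "WWW"))
```

In an iterated search, the matrix built from round i would replace the query and substitution matrix when scoring database sequences in round i + 1; residues that are conserved in a column score well above residues never observed there.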
After describing these refinements to BLAST in greater detail,
we consider several biological examples for which the sensitivity
and speed of the program are greatly enhanced.