Given the descriptor of a query compound and a database of compound descriptors, search for compounds that are similar to the query compound. The output can be limited by supplying either a cutoff similarity score or a maximum number of compounds to return. The function can also return the scores together with the compounds.
db |
The compound descriptor database returned by 'cmp.parse'. |
query |
The query descriptor, which is usually returned by 'cmp.parse1'. |
type |
Returns results in form of position indices (type=1), named vector with compound IDs (type=2) or data frame (type=3). |
cutoff |
The cutoff similarity (when cutoff <= 1) or the maximum number of compounds to be returned (when cutoff > 1). |
return.score |
Whether to return similarity scores. If set to TRUE, a data frame will be returned; otherwise, only the compounds' indices in the database will be returned in the order of decreasing scores. |
quiet |
Whether to disable progress information. |
mode |
Mode used when computing similarity scores. This value is passed to 'cmp.similarity'. |
visualize |
|
visualize.browse |
|
visualize.query |
'cmp.search' goes through all the compound descriptors in the database and calculates the similarity between the query compound and each compound in the database. When 'cutoff' is a similarity score (cutoff <= 1), compounds with a similarity score higher than the cutoff are returned. When 'cutoff' is a maximum number of compounds N (cutoff > 1), the N compounds with the highest similarity scores are returned.
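The cutoff rule described above can be sketched in base R without the package. This is only an illustration under stated assumptions: 'tanimoto' and 'apply_cutoff' are hypothetical helpers, the descriptors are toy integer sets rather than real atom pair descriptors, the Tanimoto coefficient stands in for whatever 'cmp.similarity' computes for the chosen 'mode', and the handling of scores exactly equal to the cutoff is a guess.

```r
# Assumption: a Tanimoto-style coefficient on sets stands in for the
# package's own similarity computation.
tanimoto <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical helper mirroring the documented cutoff rule:
# cutoff <= 1 is a similarity threshold, cutoff > 1 is a maximum hit count.
apply_cutoff <- function(scores, cutoff) {
  ranked <- order(scores, decreasing = TRUE)  # positions, best score first
  if (cutoff <= 1) {
    ranked[scores[ranked] > cutoff]           # "higher than the cutoff"
  } else {
    head(ranked, cutoff)                      # the N highest-scoring compounds
  }
}

# Toy descriptor database and query (not real descriptors).
toy_db <- list(c1 = c(1, 2, 3, 4), c2 = c(1, 2, 5), c3 = c(7, 8), c4 = c(1, 2, 3))
toy_query <- c(1, 2, 3, 4)

# Score every database entry against the query.
scores <- vapply(toy_db, tanimoto, numeric(1), b = toy_query)

by_score  <- apply_cutoff(scores, 0.5)  # positions of entries scoring above 0.5
top_three <- apply_cutoff(scores, 3)    # positions of the three best entries
```

Both calls return database positions in order of decreasing score, which corresponds to what the function returns with type=1.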
When 'return.score' is set to FALSE, a vector of matching compounds' indices in the database will be returned. Otherwise, a data frame will be returned:
ids |
The indices of matching compounds in the database. |
scores |
The similarity scores between the matching compounds and the query compound. |
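As a rough sketch of working with the data frame result, the snippet below rebuilds the first three rows of the type=3 output printed in the Examples and filters them in base R. The column names 'index', 'cid' and 'scores' follow that printed output; the 0.32 threshold and the step that drops the query's self-match are illustrative assumptions, not package behaviour.

```r
# First three rows copied from the type=3 output shown in the Examples.
result <- data.frame(
  index  = c(1L, 96L, 67L),
  cid    = c("650001", "650102", "650072"),
  scores = c(1.0000000, 0.3516643, 0.3117569),
  stringsAsFactors = FALSE
)

# Drop the query's self-match (score 1, present because the query was taken
# from the database) and keep hits above an illustrative stricter threshold.
hits <- result[result$scores < 1 & result$scores > 0.32, ]

# Database positions of the remaining hits, usable for subsetting the database.
hit_indices <- hits$index
```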
Y. Eddie Cao, Li-Chang Cheng
Chen X and Reynolds CH (2002). "Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients", in J Chem Inf Comput Sci.
cmp.parse1, cmp.parse, cmp.search, cmp.cluster, cmp.similarity, sdf.visualize
## Load sample SD file
# data(sdfsample); sdfset <- sdfsample
## Generate atom pair descriptor database for searching
# apset <- sdf2ap(sdfset)
## Load the same atom pair sample data set provided by the library
data(apset)
db <- apset
query <- db[1]
## Optionally, save the db for future use
save(db, file="db.rda", compress=TRUE)
## Search for similar compounds using similarity cutoff
cmp.search(db, query, cutoff=0.2, type=1) # returns index
cmp.search(db, query, cutoff=0.2, type=2) # returns named vector
cmp.search(db, query, cutoff=0.2, type=3) # returns data frame
## in the next session, you may load a saved db and do the search:
load("db.rda")
cmp.search(db, query, cutoff=3)
## you may also use the loaded db to do clustering:
cmp.cluster(db, cutoff=0.35)
[1] 1 96 67 88 15 77 31 98 86 83 64 85 72 4 2 51 23 74 11 38 79 70 75 25 93
[26] 32 69 52 43 63 47 66 91 78 94 3 16 18 99 39 68 45 71 20 22 9 12 92 61 60
[51] 19 40
650001 650102 650072 650094 650015 650082 650032 650104
1.0000000 0.3516643 0.3117569 0.3094629 0.3010753 0.2960969 0.2848181 0.2777778
650092 650089 650069 650091 650077 650004 650002 650054
0.2739274 0.2738462 0.2736842 0.2724796 0.2674591 0.2641975 0.2637037 0.2633411
650024 650079 650011 650039 650085 650075 650080 650026
0.2581121 0.2575107 0.2559653 0.2539062 0.2518337 0.2506297 0.2496552 0.2485795
650099 650033 650074 650056 650044 650068 650048 650071
0.2438163 0.2410959 0.2408840 0.2330346 0.2322503 0.2321900 0.2320099 0.2301459
650097 650083 650100 650003 650016 650019 650105 650040
0.2251908 0.2225313 0.2208333 0.2185714 0.2176471 0.2163389 0.2159091 0.2127329
650073 650046 650076 650021 650023 650009 650012 650098
0.2124601 0.2112971 0.2107438 0.2099291 0.2098361 0.2098361 0.2082153 0.2071097
650066 650065 650020 650041
0.2065064 0.2065064 0.2034884 0.2019544
index cid scores
1 1 650001 1.0000000
2 96 650102 0.3516643
3 67 650072 0.3117569
4 88 650094 0.3094629
5 15 650015 0.3010753
6 77 650082 0.2960969
7 31 650032 0.2848181
8 98 650104 0.2777778
9 86 650092 0.2739274
10 83 650089 0.2738462
11 64 650069 0.2736842
12 85 650091 0.2724796
13 72 650077 0.2674591
14 4 650004 0.2641975
15 2 650002 0.2637037
16 51 650054 0.2633411
17 23 650024 0.2581121
18 74 650079 0.2575107
19 11 650011 0.2559653
20 38 650039 0.2539062
21 79 650085 0.2518337
22 70 650075 0.2506297
23 75 650080 0.2496552
24 25 650026 0.2485795
25 93 650099 0.2438163
26 32 650033 0.2410959
27 69 650074 0.2408840
28 52 650056 0.2330346
29 43 650044 0.2322503
30 63 650068 0.2321900
31 47 650048 0.2320099
32 66 650071 0.2301459
33 91 650097 0.2251908
34 78 650083 0.2225313
35 94 650100 0.2208333
36 3 650003 0.2185714
37 16 650016 0.2176471
38 18 650019 0.2163389
39 99 650105 0.2159091
40 39 650040 0.2127329
41 68 650073 0.2124601
42 45 650046 0.2112971
43 71 650076 0.2107438
44 20 650021 0.2099291
45 22 650023 0.2098361
46 9 650009 0.2098361
47 12 650012 0.2082153
48 92 650098 0.2071097
49 61 650066 0.2065064
50 60 650065 0.2065064
51 19 650020 0.2034884
52 40 650041 0.2019544
[1] 1 96 67
sorting result...
ids CLSZ_0.35 CLID_0.35
2 650002 28 2
8 650008 28 2
11 650011 28 2
15 650015 28 2
31 650032 28 2
38 650039 28 2
45 650046 28 2
47 650048 28 2
51 650054 28 2
52 650056 28 2
53 650058 28 2
63 650068 28 2
64 650069 28 2
65 650070 28 2
67 650072 28 2
69 650074 28 2
71 650076 28 2
75 650080 28 2
78 650083 28 2
79 650085 28 2
85 650091 28 2
86 650092 28 2
88 650094 28 2
91 650097 28 2
93 650099 28 2
94 650100 28 2
99 650105 28 2
100 650106 28 2
4 650004 8 4
12 650012 8 4
18 650019 8 4
32 650033 8 4
40 650041 8 4
77 650082 8 4
84 650090 8 4
98 650104 8 4
1 650001 2 1
96 650102 2 1
3 650003 2 3
7 650007 2 3
16 650016 2 16
72 650077 2 16
20 650021 2 20
28 650029 2 20
48 650049 2 48
49 650050 2 48
54 650059 2 54
55 650060 2 54
56 650061 2 56
57 650062 2 56
58 650063 2 58
59 650064 2 58
60 650065 2 60
61 650066 2 60
5 650005 1 5
6 650006 1 6
9 650009 1 9
10 650010 1 10
13 650013 1 13
14 650014 1 14
17 650017 1 17
19 650020 1 19
21 650022 1 21
22 650023 1 22
23 650024 1 23
24 650025 1 24
25 650026 1 25
26 650027 1 26
27 650028 1 27
29 650030 1 29
30 650031 1 30
33 650034 1 33
34 650035 1 34
35 650036 1 35
36 650037 1 36
37 650038 1 37
39 650040 1 39
41 650042 1 41
42 650043 1 42
43 650044 1 43
44 650045 1 44
46 650047 1 46
50 650052 1 50
62 650067 1 62
66 650071 1 66
68 650073 1 68
70 650075 1 70
73 650078 1 73
74 650079 1 74
76 650081 1 76
80 650086 1 80
81 650087 1 81
82 650088 1 82
83 650089 1 83
87 650093 1 87
89 650095 1 89
90 650096 1 90
92 650098 1 92
95 650101 1 95
97 650103 1 97