Data summary:
Classify hypothetical samples of gilled mushrooms in the Agaricus
and Lepiota family as edible or poisonous. From the UCI repository of machine learning databases.
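Chinese keywords:
蘑菇 (mushroom), 数据集 (dataset), 机器学习 (machine learning), 食用 (edible), 有毒 (poisonous)
English keywords:
Mushrooms, dataset, Machine Learning, edible, poisonous
Data format:
TEXT
Data use:
Information Processing
Classification
Detailed description: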
Mushrooms dataset
The information is a replica of the notes for the mushroom dataset from the UCI repository of machine learning databases.
1. Title: Mushrooms Database
2. Sources:
   Mushroom records drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf
   Donor: Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu)
   Date: 27 April 1987
3. Past Usage:
   Schlimmer, J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19). Doctoral dissertation, Department of Information and Computer Science, University of California, Irvine.
      -- STAGGER: asymptoted to 95% classification accuracy after reviewing 1000 instances.
   Iba, W., Wogulis, J., & Langley, P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In Proceedings of the 5th International Conference on Machine Learning, 73-79. Ann Arbor, Michigan: Morgan Kaufmann.
      -- approximately the same results with their HILLARY algorithm
4. Relevant Information:
   This data set includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family (pp. 500-525). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no simple rule for determining the edibility of a mushroom; no rule like "leaflets three, let it be" for Poisonous Oak and Ivy.
5. Number of Instances: 8124
6. Number of Attributes: 22 (all nominally valued)
7. Attribute Information: (classes: edible=e, poisonous=p)
    1. cap-shape: bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s
    2. cap-surface: fibrous=f, grooves=g, scaly=y, smooth=s
    3. cap-color: brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y
    4. bruises?: bruises=t, no=f
    5. odor: almond=a, anise=l, creosote=c, fishy=y, foul=f, musty=m, none=n, pungent=p, spicy=s
    6. gill-attachment: attached=a, descending=d, free=f, notched=n
    7. gill-spacing: close=c, crowded=w, distant=d
    8. gill-size: broad=b, narrow=n
    9. gill-color: black=k, brown=n, buff=b, chocolate=h, gray=g, green=r, orange=o, pink=p, purple=u, red=e, white=w, yellow=y
   10. stalk-shape: enlarging=e, tapering=t
   11. stalk-root: bulbous=b, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r, missing=?
   12. stalk-surface-above-ring: fibrous=f, scaly=y, silky=k, smooth=s
   13. stalk-surface-below-ring: fibrous=f, scaly=y, silky=k, smooth=s
   14. stalk-color-above-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
   15. stalk-color-below-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
   16. veil-type: partial=p, universal=u
   17. veil-color: brown=n, orange=o, white=w, yellow=y
   18. ring-number: none=n, one=o, two=t
   19. ring-type: cobwebby=c, evanescent=e, flaring=f, large=l, none=n, pendant=p, sheathing=s, zone=z
   20. spore-print-color: black=k, brown=n, buff=b, chocolate=h, green=r, orange=o, purple=u, white=w, yellow=y
   21. population: abundant=a, clustered=c, numerous=n, scattered=s, several=v, solitary=y
   22. habitat: grasses=g, leaves=l, meadows=m, paths=p, urban=u, waste=w, woods=d
8. Missing Attribute Values:
   2480 of them (denoted by "?"), all for attribute #11.
9. Class Distribution:
   edible: 4208 (51.8%)
   poisonous: 3916 (48.2%)
   total: 8124 instances
10. Modifications for Delve:
   Stalk-root (attribute 11) is not used in the eat prototask because it has missing values.
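The coding scheme described in the notes can be illustrated with a short Python sketch. It rests on several assumptions that the notes do not state: that each record is a single comma-separated line with the class code first, followed by the 22 attribute codes in the listed order; that the sample record is made up for illustration; and that the names parse_record, ATTRIBUTES, and CODES are hypothetical helpers. Only a few of the code tables are copied in, to keep the example short. By default the sketch drops stalk-root, mirroring the Delve modification.

```python
# Attribute names, in the order given under "Attribute Information" (1 to 22).
ATTRIBUTES = [
    "cap-shape", "cap-surface", "cap-color", "bruises?", "odor",
    "gill-attachment", "gill-spacing", "gill-size", "gill-color",
    "stalk-shape", "stalk-root", "stalk-surface-above-ring",
    "stalk-surface-below-ring", "stalk-color-above-ring",
    "stalk-color-below-ring", "veil-type", "veil-color", "ring-number",
    "ring-type", "spore-print-color", "population", "habitat",
]

CLASSES = {"e": "edible", "p": "poisonous"}

# Code tables for three attributes only; the rest are left as raw codes.
CODES = {
    "bruises?": {"t": "bruises", "f": "no"},
    "odor": {"a": "almond", "l": "anise", "c": "creosote", "y": "fishy",
             "f": "foul", "m": "musty", "n": "none", "p": "pungent",
             "s": "spicy"},
    "stalk-root": {"b": "bulbous", "c": "club", "u": "cup", "e": "equal",
                   "z": "rhizomorphs", "r": "rooted"},
}


def parse_record(line, drop=("stalk-root",)):
    """Decode one record into (class label, {attribute: value}).

    Assumed layout (not stated in the notes): class code first, then the
    22 attribute codes, comma-separated. "?" is mapped to None.
    """
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 1 + len(ATTRIBUTES):
        raise ValueError(f"expected 23 fields, got {len(fields)}")
    label = CLASSES[fields[0]]
    features = {}
    for name, code in zip(ATTRIBUTES, fields[1:]):
        if name in drop:
            continue  # e.g. stalk-root, which has missing values
        if code == "?":
            features[name] = None
        else:
            # Fall back to the raw code when no table was copied in.
            features[name] = CODES.get(name, {}).get(code, code)
    return label, features


# Made-up record for illustration; attribute 11 (stalk-root) is missing.
SAMPLE = "e,x,s,n,f,n,f,c,b,w,e,?,s,s,w,w,p,w,o,p,n,v,d"

label, features = parse_record(SAMPLE)
print(label, features["odor"], len(features))

# Class shares recomputed from the counts in "Class Distribution".
EDIBLE, POISONOUS = 4208, 3916
TOTAL = EDIBLE + POISONOUS
print(round(100 * EDIBLE / TOTAL, 1), round(100 * POISONOUS / TOTAL, 1))
```

Passing drop=() keeps stalk-root in the output, with None standing in for "?", which may be preferable when the learner can handle missing values itself.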