From Wikipedia, the free encyclopedia

Data mining is the process of extracting and finding patterns in massive data sets involving methods at the intersection of machine learning, statistics, and database systems.[1] Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use.[1][2][3][4] Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.[5] Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[1]

The term "data mining" is a misnomer because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself.[6] It also is a buzzword[7] and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support systems, including artificial intelligence (e.g., machine learning) and business intelligence. Often the more general terms (large scale) data analysis and analytics—or, when referring to actual methods, artificial intelligence and machine learning—are more appropriate.

The actual data mining task is the semi-automatic or automatic analysis of massive quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, although they do belong to the overall KDD process as additional steps.

The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data. In contrast, data mining uses machine learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.[8]

The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.

Etymology

In the 1960s, statisticians and economists used terms like data fishing or data dredging to refer to what they considered the bad practice of analyzing data without an a priori hypothesis. The term "data mining" was used in a similarly critical way by economist Michael Lovell in an article published in The Review of Economics and Statistics in 1983.[9][10] Lovell indicates that the practice "masquerades under a variety of aliases, ranging from "experimentation" (positive) to "fishing" or "snooping" (negative)".

The term data mining appeared around 1990 in the database community, with generally positive connotations. For a short time in the 1980s, the phrase "database mining"™ was used, but since it had been trademarked by HNC, a San Diego–based company, to pitch their Database Mining Workstation,[11] researchers turned to data mining instead. Other terms used include data archaeology, information harvesting, information discovery, and knowledge extraction. Gregory Piatetsky-Shapiro coined the term "knowledge discovery in databases" for the first workshop on the topic (KDD-1989), and this term became more popular in the AI and machine learning communities. However, the term data mining became more popular in the business and press communities.[12] Currently, the terms data mining and knowledge discovery are used interchangeably.

Background

The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis (1800s).[13] The proliferation, ubiquity and increasing power of computer technology have dramatically increased data collection, storage, and manipulation ability. As data sets have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, especially in the field of machine learning, such as neural networks, cluster analysis, genetic algorithms (1950s), decision trees and decision rules (1960s), and support vector machines (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns in large data sets.[14] It bridges the gap from applied statistics and artificial intelligence (which usually provide the mathematical background) to database management by exploiting the way data is stored and indexed in databases to execute the actual learning and discovery algorithms more efficiently, allowing such methods to be applied to ever-larger data sets.

Process

The knowledge discovery in databases (KDD) process is commonly defined with the stages:

  1. Selection
  2. Pre-processing
  3. Transformation
  4. Data mining
  5. Interpretation/evaluation.[5]

Many variations on this theme exist, however, such as the Cross-industry standard process for data mining (CRISP-DM), which defines six phases:

  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation
  6. Deployment

or a simplified process such as (1) Pre-processing, (2) Data Mining, and (3) Results Validation.

Polls conducted in 2002, 2004, 2007 and 2014 show that the CRISP-DM methodology is the leading methodology used by data miners.[15][16][17][18]

The only other data mining standard named in these polls was SEMMA. However, 3–4 times as many people reported using CRISP-DM. Several teams of researchers have published reviews of data mining process models,[19] and Azevedo and Santos conducted a comparison of CRISP-DM and SEMMA in 2008.[20]

Pre-processing

Before data mining algorithms can be used, a target data set must be assembled. As data mining can only uncover patterns actually present in the data, the target data set must be large enough to contain these patterns while remaining concise enough to be mined within an acceptable time limit. A common source for data is a data mart or data warehouse. Pre-processing is essential to analyze the multivariate data sets before data mining. The target set is then cleaned. Data cleaning removes the observations containing noise and those with missing data.
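
The cleaning step described above can be sketched in a few lines. This is only a minimal illustration in plain Python, assuming that each observation is a dictionary, that a missing value is stored as `None`, and that "noise" means a value outside a plausible range. The field names, the sample records, and the range bounds are all hypothetical.

```python
# Hypothetical raw observations; None marks a missing value.
records = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 48000.0},   # missing age
    {"age": 29, "income": None},        # missing income
    {"age": 410, "income": 51000.0},    # implausible age, treated as noise
    {"age": 57, "income": 61000.0},
]

# Assumed plausible range per field (illustrative values only).
valid_ranges = {"age": (0, 120), "income": (0.0, 1e7)}


def is_clean(record: dict) -> bool:
    """Return True if no field is missing and every value is within its range."""
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            return False
    return True


# Keep only the observations that pass both checks.
cleaned = [r for r in records if is_clean(r)]
print(cleaned)
```

Dropping whole observations is the simplest policy and matches the description above; whether it is appropriate depends on how much data is lost by doing so.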

Data mining

Data mining involves six common classes of tasks:[5]

  • Anomaly detection (outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or of data errors that require further investigation because they fall outside the standard range.
  • Association rule learning (dependency modeling) – Searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
  • Clustering – is the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
  • Classification – is the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
  • Regression – attempts to find a function that models the data with the least error, that is, to estimate the relationships among data or datasets.
  • Summarization – providing a more compact representation of the data set, including visualization and report generation.
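
As an illustration of one of these tasks, the supermarket example for association rule learning can be sketched as follows. This is a simplified sketch restricted to rules between pairs of items, using the usual support and confidence measures; the baskets, the function name `pair_rules`, and the two thresholds are assumptions made for the example, and practical systems typically use dedicated algorithms rather than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) tuples."""
    n = len(transactions)
    # How many transactions contain each item, and each unordered pair.
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Each pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support}, confidence={confidence}")
```

In this toy data, every basket containing milk also contains butter, so the rule "milk -> butter" has confidence 1.0 even though its support is lower than that of "bread -> butter".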

Results validation

Figure: An example of data produced by data dredging through a bot operated by statistician Tyler Vigen, apparently showing a close link between the winning word in a spelling bee competition and the number of people in the United States killed by venomous spiders.

Data mining can unintentionally be misused, producing results that appear to be significant but do not actually predict future behavior and cannot be reproduced on a new sample of data, and are therefore of little use. This is sometimes caused by investigating too many hypotheses without performing proper statistical hypothesis testing. A simple version of this problem in machine learning is known as overfitting, but the same problem can arise at different phases of the process, and thus a train/test split, when applicable at all, may not be sufficient to prevent it.[21]
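
The effect of investigating too many hypotheses can be demonstrated with purely random data. The sketch below is an illustration only: it assumes 1,000 candidate variables of 10 observations each, all drawn independently of the target, and simply reports the strongest correlation found. The sample sizes, the seed, and the helper name `pearson` are arbitrary choices for the example.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_points = 10

# A random target and 1,000 random candidates that are unrelated to it.
target = [rng.gauss(0, 1) for _ in range(n_points)]
candidates = [[rng.gauss(0, 1) for _ in range(n_points)] for _ in range(1000)]

correlations = [pearson(c, target) for c in candidates]
best = max(correlations, key=abs)                  # the "discovered" pattern
typical = statistics.median(abs(r) for r in correlations)

print(f"strongest correlation among 1000 unrelated variables: {best:.2f}")
print(f"median absolute correlation: {typical:.2f}")
```

Because the strongest of many chance correlations is selected after the fact, it looks far more impressive than a typical one, yet it would not be expected to hold on a fresh sample.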

The final step of knowledge discovery from data is to verify that the patterns produced by the data mining algorithms occur in the wider data set. Not all patterns found by the algorithms are necessarily valid. It is common for data mining algorithms to find patterns in the training set which are not present in the general data set. This is called overfitting. To overcome this, the evaluation uses a test set of data on which the data mining algorithm was not trained. The learned patterns are applied to this test set, and the resulting output is compared to the desired output. For example, a data mining algorithm trying to distinguish "spam" from "legitimate" e-mails would be trained on a training set of sample e-mails. Once trained, the learned patterns would be applied to the test set of e-mails on which it had not been trained. The accuracy of the patterns can then be measured from how many e-mails they correctly classify. Several statistical methods may be used to evaluate the algorithm, such as ROC curves.
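
The spam example above can be sketched as a toy evaluation. The classifier here is deliberately naive and is an assumption of this sketch, not a method from the cited sources: it flags a message as spam if it contains any word that appeared in spam but never in legitimate mail during training. All messages and labels are hypothetical.

```python
# Hypothetical labelled e-mails as (text, label) pairs.
train = [
    ("win a free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("lunch on monday", "legitimate"),
]
# Held-out test set: never shown to the training step.
test = [
    ("free prize inside", "spam"),
    ("agenda for the lunch meeting", "legitimate"),
    ("claim the prize you won at the office raffle", "legitimate"),
]


def train_model(examples):
    """Learn the set of words seen in spam but never in legitimate mail."""
    spam_words, legit_words = set(), set()
    for text, label in examples:
        (spam_words if label == "spam" else legit_words).update(text.split())
    return spam_words - legit_words


def classify(model, text):
    """Label a message as spam if it contains any learned spam-only word."""
    return "spam" if any(word in model for word in text.split()) else "legitimate"


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(classify(model, text) == label for text, label in examples)
    return correct / len(examples)


model = train_model(train)
train_accuracy = accuracy(model, train)
test_accuracy = accuracy(model, test)
print(f"training accuracy: {train_accuracy:.2f}")
print(f"test accuracy:     {test_accuracy:.2f}")
```

The learned pattern fits the training set perfectly but misclassifies a legitimate test message that happens to mention a prize, which is the kind of gap that evaluation on unseen data is meant to expose.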

If the learned patterns do not meet the desired standards, it is necessary to re-evaluate and change the pre-processing and data mining steps. If the learned patterns do meet the desired standards, then the final step is to interpret the learned patterns and turn them into knowledge.

Research

The premier professional body in the field is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining (SIGKDD).[22][23] Since 1989, this ACM SIG has hosted an annual international conference and published its proceedings,[24] and since 1999 it has published a biannual academic journal titled "SIGKDD Explorations".[25]

Data mining topics are also present in many data management/database conferences such as the ICDE Conference, SIGMOD Conference and International Conference on Very Large Data Bases.

Standards

There have been some efforts to define standards for the data mining process, for example, the 1999 European Cross Industry Standard Process for Data Mining (CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Development on successors to these processes (CRISP-DM 2.0 and JDM 2.0) was active in 2006 but has stalled since. JDM 2.0 was withdrawn without reaching a final draft.

For exchanging the extracted models—in particular for use in predictive analytics—the key standard is the Predictive Model Markup Language (PMML), which is an XML-based language developed by the Data Mining Group (DMG) and supported as exchange format by many data mining applications. As the name suggests, it only covers prediction models, a particular data mining task of high importance to business applications. However, extensions to cover (for example) subspace clustering have been proposed independently of the DMG.[26]

Notable uses

Data mining is used wherever there is digital data available. Notable examples of data mining can be found throughout business, medicine, science, finance, construction, and surveillance.

Privacy concerns and ethics

While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to user behavior (ethical and otherwise).[27]

The ways in which data mining can be used can in some cases and contexts raise questions regarding privacy, legality, and ethics.[28] In particular, data mining government or commercial data sets for national security or law enforcement purposes, such as in the Total Information Awareness Program or in ADVISE, has raised privacy concerns.[29][30]

Data mining requires data preparation, which can uncover information or patterns that compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. Data aggregation involves combining data (possibly from various sources) in a way that facilitates analysis, but that might also make private, individual-level data deducible or otherwise apparent.[31] This is not data mining per se, but a result of the preparation of data before, and for the purposes of, the analysis. The threat to an individual's privacy comes into play when the data, once compiled, enable the data miner, or anyone who has access to the newly compiled data set, to identify specific individuals, especially when the data were originally anonymous.[32]
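
How aggregation can make individual-level data apparent is easiest to see with a small linkage example. The sketch below assumes two hypothetical data sets: a medical table with names removed and a separate public list that still carries names. Both tables, the choice of shared attributes, and all values are invented for illustration.

```python
from collections import defaultdict

# Attributes assumed to appear in both data sets (hypothetical choice).
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

# Hypothetical "anonymous" records: names removed, other attributes kept.
medical = [
    {"zip": "02138", "birth_year": 1965, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1971, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02138", "birth_year": 1980, "sex": "M", "diagnosis": "migraine"},
]

# Hypothetical public list that includes names.
public = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1965, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1971, "sex": "M"},
    {"name": "Carl Example", "zip": "02139", "birth_year": 1971, "sex": "M"},
]


def key_of(record):
    """Combine the shared attributes into a single lookup key."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


# Index the public list by the shared attributes.
names_by_key = defaultdict(list)
for person in public:
    names_by_key[key_of(person)].append(person["name"])

# A record is re-identified when exactly one named person shares its key.
reidentified = []
for record in medical:
    matches = names_by_key.get(key_of(record), [])
    if len(matches) == 1:
        reidentified.append((matches[0], record["diagnosis"]))

print(reidentified)
```

Neither table reveals a named person's diagnosis on its own; the disclosure arises only once the two are combined, and only for records whose attribute combination is unique in the public list.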

Data may also be modified so as to become anonymous, so that individuals may not readily be identified.[31] However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.[33]

The inadvertent revelation of personally identifiable information leading to the provider violates Fair Information Practices. This indiscretion can cause financial, emotional, or bodily harm to the indicated individual. In one instance of privacy violation, the patrons of Walgreens filed a lawsuit against the company in 2011 for selling prescription information to data mining companies who in turn provided the data to pharmaceutical companies.[34]

Situation in Europe

Europe has rather strong privacy laws, and efforts are underway to further strengthen the rights of the consumers. However, the U.S.–E.U. Safe Harbor Principles, developed between 1998 and 2000, currently effectively expose European users to privacy exploitation by U.S. companies. As a consequence of Edward Snowden's global surveillance disclosure, there has been increased discussion to revoke this agreement, as in particular the data will be fully exposed to the National Security Agency, and attempts to reach an agreement with the United States have failed.[35]

In the United Kingdom in particular, there have been cases of corporations using data mining to target certain groups of customers, forcing them to pay unfairly high prices. These groups tend to be people of lower socio-economic status who are not savvy to the ways they can be exploited in digital marketplaces.[36]

Situation in the United States

In the United States, privacy concerns have been addressed by the US Congress via the passage of regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA). The HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in Biotech Business Week, "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approach a level of incomprehensibility to average individuals."[37] This underscores the necessity for data anonymity in data aggregation and mining practices.

U.S. information privacy legislation such as HIPAA and the Family Educational Rights and Privacy Act (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation.

Copyright law

Situation in Europe

Under European copyright and database laws, the mining of in-copyright works (such as by web mining) without the permission of the copyright owner is not legal. Where a database is pure data in Europe, it may be that there is no copyright, but database rights may exist, so data mining becomes subject to intellectual property owners' rights that are protected by the Database Directive. On the recommendation of the Hargreaves review, the UK government amended its copyright law in 2014 to allow content mining as a limitation and exception.[38] The UK was the second country in the world to do so after Japan, which introduced an exception for data mining in 2009. However, due to the restriction of the Information Society Directive (2001), the UK exception only allows content mining for non-commercial purposes. UK copyright law also does not allow this provision to be overridden by contractual terms and conditions. Since 2020, Switzerland has also regulated data mining, allowing it in the research field under certain conditions laid down by art. 24d of the Swiss Copyright Act. This new article entered into force on 1 April 2020.[39]

The European Commission facilitated stakeholder discussion on text and data mining in 2013, under the title of Licences for Europe.[40] The focus on solutions to this legal issue such as licensing, rather than limitations and exceptions, led representatives of universities, researchers, libraries, civil society groups and open access publishers to leave the stakeholder dialogue in May 2013.[41]

Situation in the United States

US copyright law, and in particular its provision for fair use, upholds the legality of content mining in America, as does the law in other fair use countries such as Israel, Taiwan and South Korea. As content mining is transformative, that is, it does not supplant the original work, it is viewed as being lawful under fair use. For example, as part of the Google Book settlement, the presiding judge on the case ruled that Google's digitization project of in-copyright books was lawful, in part because of the transformative uses that the digitization project displayed, one being text and data mining.[42]

References

  1. ^ a b c "Data Mining Curriculum". ACM SIGKDD. 2025-08-06. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  2. ^ Clifton, Christopher (2010). "Encyclopædia Britannica: Definition of Data Mining". Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  3. ^ Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). "The Elements of Statistical Learning: Data Mining, Inference, and Prediction". Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  4. ^ Han, Jiawei; Kamber, Micheline; Pei, Jian (2011). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann. ISBN 978-0-12-381479-1.
  5. ^ a b c Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). "From Data Mining to Knowledge Discovery in Databases" (PDF). Archived (PDF) from the original on 2025-08-06. Retrieved 17 December 2008.
  6. ^ Han, Jiawei; Kamber, Micheline (2001). Data mining: concepts and techniques. Morgan Kaufmann. p. 5. ISBN 978-1-55860-489-6. Thus, data mining should have been more appropriately named "knowledge mining from data," which is unfortunately somewhat long
  7. ^ OKAIRP 2005 Fall Conference, Arizona State University Archived 2025-08-06 at the Wayback Machine
  8. ^ Olson, D. L. (2007). Data mining in business services. Service Business, 1(3), 181–193. doi:10.1007/s11628-006-0014-7
  9. ^ Lovell, Michael C. (1983). "Data Mining". The Review of Economics and Statistics. 65 (1): 1–12. doi:10.2307/1924403. JSTOR 1924403.
  10. ^ Charemza, Wojciech W.; Deadman, Derek F. (1992). "Data Mining". New Directions in Econometric Practice. Aldershot: Edward Elgar. pp. 14–31. ISBN 1-85278-461-X.
  11. ^ Mena, Jesús (2011). Machine Learning Forensics for Law Enforcement, Security, and Intelligence. Boca Raton, FL: CRC Press (Taylor & Francis Group). ISBN 978-1-4398-6069-4.
  12. ^ Piatetsky-Shapiro, Gregory; Parker, Gary (2011). "Lesson: Data Mining, and Knowledge Discovery: An Introduction". Introduction to Data Mining. KD Nuggets. Archived from the original on 30 August 2012. Retrieved 30 August 2012.
  13. ^ Coenen, Frans (2025-08-06). "Data mining: past, present and future". The Knowledge Engineering Review. 26 (1): 25–29. doi:10.1017/S0269888910000378. ISSN 0269-8889. S2CID 6487637. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  14. ^ Kantardzic, Mehmed (2003). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons. ISBN 978-0-471-22852-3. OCLC 50055336.
  15. ^ "What main methodology are you using for data mining (2002)?". KDnuggets. 2002. Archived from the original on 16 January 2017. Retrieved 29 December 2023.
  16. ^ "What main methodology are you using for data mining (2004)?". KDnuggets. 2004. Archived from the original on 8 February 2017. Retrieved 29 December 2023.
  17. ^ "What main methodology are you using for data mining (2007)?". KDnuggets. 2007. Archived from the original on 17 November 2012. Retrieved 29 December 2023.
  18. ^ "What main methodology are you using for data mining (2014)?". KDnuggets. 2014. Archived from the original on 1 August 2016. Retrieved 29 December 2023.
  19. ^ Lukasz Kurgan and Petr Musilek: "A survey of Knowledge Discovery and Data Mining process models" Archived 2025-08-06 at the Wayback Machine. The Knowledge Engineering Review. Volume 21 Issue 1, March 2006, pp 1–24, Cambridge University Press, New York, doi:10.1017/S0269888906000737
  20. ^ Azevedo, A. and Santos, M. F. KDD, SEMMA and CRISP-DM: a parallel overview Archived 2025-08-06 at the Wayback Machine. In Proceedings of the IADIS European Conference on Data Mining 2008, pp 182–185.
  21. ^ Hawkins, Douglas M (2004). "The problem of overfitting". Journal of Chemical Information and Computer Sciences. 44 (1): 1–12. doi:10.1021/ci0342472. PMID 14741005. S2CID 12440383.
  22. ^ "Microsoft Academic Search: Top conferences in data mining". Microsoft Academic Search. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  23. ^ "Google Scholar: Top publications - Data Mining & Analysis". Google Scholar. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  24. ^ Proceedings Archived 2025-08-06 at the Wayback Machine, International Conferences on Knowledge Discovery and Data Mining, ACM, New York.
  25. ^ SIGKDD Explorations Archived 2025-08-06 at the Wayback Machine, ACM, New York.
  26. ^ Günnemann, Stephan; Kremer, Hardy; Seidl, Thomas (2011). "An extension of the PMML standard to subspace clustering models". Proceedings of the 2011 workshop on Predictive markup language modeling. p. 48. doi:10.1145/2023598.2023605. ISBN 978-1-4503-0837-3. S2CID 14967969.
  27. ^ Seltzer, William (2005). "The Promise and Pitfalls of Data Mining: Ethical Issues" (PDF). ASA Section on Government Statistics. American Statistical Association. Archived (PDF) from the original on 2025-08-06.
  28. ^ Pitts, Chip (15 March 2007). "The End of Illegal Domestic Spying? Don't Count on It". Washington Spectator. Archived from the original on 2025-08-06.
  29. ^ Taipale, Kim A. (15 December 2003). "Data Mining and Domestic Security: Connecting the Dots to Make Sense of Data". Columbia Science and Technology Law Review. 5 (2). OCLC 45263753. SSRN 546782. Archived from the original on 5 November 2014. Retrieved 21 April 2004.
  30. ^ Resig, John. "A Framework for Mining Instant Messaging Services" (PDF). Archived (PDF) from the original on 2025-08-06. Retrieved 16 March 2018.
  31. ^ a b Think Before You Dig: Privacy Implications of Data Mining & Aggregation Archived 2025-08-06 at the Wayback Machine, NASCIO Research Brief, September 2004
  32. ^ Ohm, Paul. "Don't Build a Database of Ruin". Harvard Business Review.
  33. ^ AOL search data identified individuals Archived 2025-08-06 at the Wayback Machine, SecurityFocus, August 2006
  34. ^ Kshetri, Nir (2014). "Big data's impact on privacy, security and consumer welfare" (PDF). Telecommunications Policy. 38 (11): 1134–1145. doi:10.1016/j.telpol.2014.10.002. Archived (PDF) from the original on 2025-08-06. Retrieved 2025-08-06.
  35. ^ Weiss, Martin A.; Archick, Kristin (19 May 2016). "U.S.–E.U. Data Privacy: From Safe Harbor to Privacy Shield". Washington, D.C. Congressional Research Service. p. 6. R44257. Archived from the original (PDF) on 9 April 2020. Retrieved 9 April 2020. On October 6, 2015, the CJEU ... issued a decision that invalidated Safe Harbor (effective immediately), as currently implemented.
  36. ^ Parker, George (2025-08-06). "UK companies targeted for using big data to exploit customers". Financial Times. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  37. ^ Biotech Business Week Editors (June 30, 2008); BIOMEDICINE; HIPAA Privacy Rule Impedes Biomedical Research, Biotech Business Week, retrieved 17 November 2009 from LexisNexis Academic
  38. ^ UK Researchers Given Data Mining Right Under New UK Copyright Laws. Archived June 9, 2014, at the Wayback Machine Out-Law.com. Retrieved 14 November 2014
  39. ^ "Fedlex". Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  40. ^ "Licences for Europe – Structured Stakeholder Dialogue 2013". European Commission. Archived from the original on 23 March 2013. Retrieved 14 November 2014.
  41. ^ "Text and Data Mining:Its importance and the need for change in Europe". Association of European Research Libraries. Archived from the original on 29 November 2014. Retrieved 14 November 2014.
  42. ^ "Judge grants summary judgment in favor of Google Books – a fair use victory". Lexology.com. Antonelli Law Ltd. 19 November 2013. Archived from the original on 29 November 2014. Retrieved 14 November 2014.
