Wednesday, July 17, 2019

Isds Ch 5

Business Intelligence, 2e (Turban/Sharda/Delen/King) Chapter 5 Text and Web Mining

1) DARPA and MITRE teamed up to develop capabilities to automatically filter text-based information sources to generate actionable information in a timely manner.
Answer: TRUE
Diff: 2 Page Ref: 190

2) A vast volume of business data is captured and stored in text documents that are structured.
Answer: FALSE
Diff: 2 Page Ref: 192

3) Text mining is important to competitive advantage because knowledge is power, and knowledge is derived from text data sources.
Answer: TRUE
Diff: 2 Page Ref: 192

4) The purpose and processes of text mining are different from those of data mining because with text mining the input to the process consists of data files such as Word documents, PDF files, text excerpts, and XML files.
Answer: FALSE
Diff: 3 Page Ref: 192

5) The benefits of text mining are greatest in areas where very large amounts of textual data are being generated, such as law, academic research, finance, and medicine.
Answer: TRUE
Diff: 2 Page Ref: 192

6) Unstructured data has a predetermined format. It is usually organized into records as categorical, ordinal, and continuous variables and stored in databases.
Answer: FALSE
Diff: 2 Page Ref: 193

7) Stemming is the process of reducing inflected words to their base or root form.
Answer: TRUE
Diff: 1 Page Ref: 193

8) Stop words, such as a, am, the, and was, are words that are filtered out prior to or after processing of natural language data.
Answer: TRUE
Diff: 2 Page Ref: 193

9) The goal of natural language processing (NLP) is syntax-driven text manipulation.
Answer: FALSE
Diff: 2 Page Ref: 196

10) Two advantages associated with the implementation of NLP are word sense disambiguation and syntactic ambiguity.
Answer: FALSE
Diff: 2 Page Ref: 196

11) By applying a learning algorithm to parsed text, researchers from Stanford University's NLP lab have developed methods that can automatically identify the concepts and relationships among those concepts in the text.
Answer: TRUE
Diff: 2 Page Ref: 197

12) Text mining can be used to increase cross-selling and up-selling by analyzing the unstructured data generated by call centers.
Answer: TRUE
Diff: 1 Page Ref: 200

13) Compared to polygraphs for lie detection, text-based deception detection has the advantages of being nonintrusive and widely applicable to textual data and transcriptions of voice recordings.
Answer: TRUE
Diff: 2 Page Ref: 201

14) The main purpose of establishing the corpus is to collect all of the documents related to the context being studied.
Answer: TRUE
Diff: 2 Page Ref: 207

15) The main categories of knowledge extraction methods are recall, search, and signaling.
Answer: FALSE
Diff: 2 Page Ref: 210

16) Web pages consisting of unstructured textual data coded in HTML and logs of visitors' interactions provide rich data that can easily provide effective and efficient knowledge discovery.
Answer: FALSE
Diff: 3 Page Ref: 217

17) Web crawlers are Web content mining tools that are used to read through the content of a Web site automatically.
Answer: FALSE
Diff: 1 Page Ref: 218

18) Amazon.com leverages Web usage history dynamically and recognizes the user by reading a cookie written by a Web site on the visitor's computer.
Answer: TRUE
Diff: 1 Page Ref: 221

19) The quality of search results is impossible to measure accurately using strictly observed measures such as click-through rate, abandonment, and search frequency. Additional quantitative and qualitative measures are required.
Answer: TRUE
Diff: 2 Page Ref: 222

20) Customer experience management applications gather and report direct feedback from site visitors by benchmarking against other sites and offline channels, and by supporting predictive modeling of future visitor behavior.
Answer: FALSE
Diff: 3 Page Ref: 224

21) A vast majority of business data are stored in text documents that are ________.
A) mostly quantitative
B) virtually unstructured
C) semi-structured
D) highly structured
Answer: B
Diff: 1 Page Ref: 192

22) Text mining is the semi-automated process of extracting ________ from large amounts of unstructured data sources.
A) patterns
B) useful information
C) knowledge
D) all of the above
Answer: D
Diff: 2 Page Ref: 192

23) All of the following are popular application areas of text mining except
A) information extraction
B) document summarization
C) question answering
D) data structuring
Answer: D
Diff: 2 Page Ref: 193

24) Which of the following correctly defines a text mining term?
A) Tagging is the number of times a word is found in a specific document.
B) A token is an uncategorized block of text in a sentence.
C) Rooting is the process of reducing inflected words to their base form.
D) A term is a single word or multiword phrase extracted directly from the corpus by means of NLP methods.
Answer: D
Diff: 3 Page Ref: 194

25) ________ is a branch of the field of linguistics and a part of natural language processing that studies the internal structure of words.
A) Morphology
B) Corpus
C) Stemming
D) Polysemes
Answer: A
Diff: 2 Page Ref: 194

26) Using ________ as a rich source of knowledge and a strategic weapon, Kodak not only survives but excels in its market segment defined by innovation and constant change.
A) visualization
B) deception detection
C) patent analysis
D) semantic cues
Answer: C
Diff: 2 Page Ref: 194

27) It has been shown that the bag-of-words method may not produce good enough information content for text mining problems. More advanced techniques such as ________ are needed.
A) classification
B) natural language processing
C) evidence-based processing
D) symbolic processing
Answer: B
Diff: 2 Page Ref: 195

28) Why will computers probably not be able to understand natural language the same way and with the same accuracy that humans do?
A) A true understanding of meaning requires extensive knowledge of a topic beyond what is in the words, sentences, and paragraphs.
B) The natural human language is too specific.
C) The part of speech depends only on the definition and not on the context within which it is used.
D) All of the above.
Answer: A
Diff: 3 Page Ref: 196

29) At a very high level, the text mining process consists of each of the following tasks except
A) create log frequencies
B) establish the corpus
C) create the term-document matrix
D) extract the knowledge
Answer: A
Diff: 2 Page Ref: 207

30) In ________, the problem is to group an unlabelled collection of objects, such as documents, customer comments, and Web pages, into meaningful groups without any prior knowledge.
A) search recall
B) classification
C) clustering
D) grouping
Answer: C
Diff: 2 Page Ref: 211

31) The two main approaches to text classification are ________ and ________.
A) knowledge engineering; machine learning
B) categorization; clustering
C) association; trend analysis
D) knowledge extraction; association
Answer: A
Diff: 2 Page Ref: 211

32) Commercial software tools include all of the following except
A) penetration
B) IBM Intelligent Miner Data Mining Suite
C) SAS Text Miner
D) SPSS Text Mining
Answer: A
Diff: 2 Page Ref: 216

33) Why does the Web pose great challenges for effective and efficient knowledge discovery?
A) The Web search engines are indexed-based.
B) The Web is too dynamic.
C) The Web is too specific to a domain.
D) The Web infrastructure conceals hyperlink information.
Answer: B
Diff: 2 Page Ref: 217

34) A simple keyword-based search engine suffers from several deficiencies, which include all of the following except
A) a topic of any breadth can easily contain hundreds or thousands of documents
B) many documents that are highly relevant to a topic may not contain the exact keywords defining them
C) Web mining can identify authoritative Web pages
D) many of the search results are marginally or not relevant to the topic
Answer: C
Diff: 3 Page Ref: 217

35) Which of the following is not one of the three main areas of Web mining?
A) Web search mining
B) Web content mining
C) Web structure mining
D) Web usage mining
Answer: A
Diff: 2 Page Ref: 218

36) Which of the following refers to developing useful information from the links included in the Web documents?
A) Web content mining
B) Web subject mining
C) Web structure mining
D) Web matter mining
Answer: C
Diff: 2 Page Ref: 219

37) A ________ is one or more Web pages that provide a collection of links to authoritative pages, reference sites, or a resource list on a specific topic.
A) hub
B) hyperlink-induced topic search
C) spoke
D) community
Answer: A
Diff: 2 Page Ref: 219

38) All of the following are types of data generated through Web page visits except
A) data stored in server access logs, referrer logs, agent logs, and client-side cookies
B) user profiles
C) hyperlink analysis
D) metadata, such as page attributes, content attributes, and usage data
Answer: C
Diff: 2 Page Ref: 220

39) When registered users revisit Amazon.com, they are greeted by name. This task involves recognizing the user by ________.
A) pattern discovery
B) association
C) text mining
D) reading a cookie
Answer: D
Diff: 1 Page Ref: 221

40) Forward-thinking companies like Ask.com, Scholastic, and St. John Health System are actively using Web mining systems to answer important questions of "Who?", "Why?", and "How?" The benefits of integrating these systems
A) are measured qualitatively in terms of customer satisfaction, but not measured using financial or other quantitative measures.
B) can be significant in terms of incremental financial growth and increased customer loyalty and satisfaction.
C) have not yet outweighed the costs of the Web mining systems and analysis.
D) can be infinitely measurable.
Answer: B
Diff: 3 Page Ref: 222

41) ________ is the semi-automated process of extracting patterns from large amounts of unstructured data sources.
Answer: Text mining
Diff: 1 Page Ref: 192

42) ________ is the process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases, where the data are organized in records structured by categorical, ordinal, or continuous variables.
Answer: Data mining
Diff: 1 Page Ref: 192

43) ________ is the grouping of similar documents without having a predefined set of categories.
Answer: Clustering
Diff: 2 Page Ref: 193

44) In linguistics, a(n) ________ is a large and structured set of texts prepared for the purpose of conducting knowledge discovery.
Answer: corpus
Diff: 1 Page Ref: 193

45) ________ is the process of reducing inflected words to their base or root form.
Answer: Stemming
Diff: 1 Page Ref: 193

46) ________ words or noise words are words that are filtered out prior to or after processing of natural language data.
Answer: Stop
Diff: 1 Page Ref: 193

47) Stop words are used in text mining to ________ commonly used words.
Answer: eliminate
Diff: 2 Page Ref: 193

48) ________ is an important component of text mining and is a subfield of artificial intelligence and computational linguistics. It studies the problem of understanding the natural human language.
Answer: Natural language processing (NLP)
Diff: 1 Page Ref: 196

49) ________ analysis is a technique used to detect favorable and unfavorable opinions toward specific products and services using textual data sources, such as customer feedback in Web postings and the detection of unfavorable rumors.
Answer: Sentiment
Diff: 2 Page Ref: 197

50) At a very high level, the first of three consecutive tasks in the text mining process is to establish the ________, which is a list of organized documents.
Answer: corpus
Diff: 1 Page Ref: 207

51) In the text mining process, the output of task two is a flat file called a ________ matrix where the cells are populated with the term frequencies.
Answer: term-document
Diff: 3 Page Ref: 207

52) One of the main approaches to text classification is ________, in which an expert's knowledge is encoded into the system either declaratively or in the form of procedural classification rules.
Answer: knowledge engineering
Diff: 2 Page Ref: 211

53) A(n) ________ is one or more Web pages that provide a collection of links to authoritative pages.
Answer: hub
Diff: 1 Page Ref: 219

54) ________ mining is the process of extracting useful information from the links embedded in Web documents.
Answer: Web structure
Diff: 2 Page Ref: 219

55) ________ mining is the extraction of useful information from data generated through Web page visits and transactions.
Answer: Web usage
Diff: 2 Page Ref: 220

56) Analysis of the information collected by Web servers can help to better understand user behavior. Analysis of this data is called ________ analysis.
Answer: clickstream
Diff: 2 Page Ref: 220

57) ________ applications focus on "who" and "how" questions by gathering and reporting direct feedback from site visitors, by benchmarking against other sites and offline channels, and by supporting predictive modeling of future visitor behavior.
Answer: Voice of Customer
Diff: 2 Page Ref: 224

58) Web analytics, CEM, and VOC applications form the foundation of the Web site ________ ecosystem that supports the online business's ability to positively influence desired outcomes.
Answer: optimization
Diff: 2 Page Ref: 224

59) The ________ model, which is one where multiple sources of data describing the same population are integrated to increase the depth and richness of the resulting analysis, forms the framework of the Web site optimization ecosystem.
Answer: convergent validation
Diff: 3 Page Ref: 225

60) Fundamental to the optimization process is ________, gathering data and information that can then be transformed into tangible analysis and recommendations for improvement using Web mining tools and techniques.
Answer: measurement
Diff: 3 Page Ref: 225

61) Compare and contrast text mining and data mining.
Answer: Text mining is the semi-automated process of extracting patterns (useful information and knowledge) from large amounts of unstructured data sources. Data mining is the process of identifying valid, novel, potentially useful, and understandable patterns in data stored in structured databases, where the data are organized in records structured by categorical, ordinal, or continuous variables. Text mining is the same as data mining in that it has the same purpose and uses the same processes, but with text mining the input to the process is a collection of unstructured data files such as Word documents, PDF files, and so on.
Diff: 2 Page Ref: 192

62) Why will computers probably not be able to understand natural language the same way and with the same accuracy that humans do?
Answer: Natural human language is too vague for computers to understand, and a true understanding of meaning requires extensive knowledge of a topic beyond what is in the words, sentences, and paragraphs.
Diff: 1 Page Ref: 196

63) NLP has successfully been applied to a variety of tasks via computer programs to automatically process natural human language that previously could only be done by humans. List three of the most popular of these tasks.
Answer: Any three of the following:
Information retrieval. The science of searching for relevant documents, finding specific information within them, and generating metadata as to their contents.
Information extraction. A type of information retrieval whose goal is to automatically extract structured information from a certain domain, using machine-readable documents.
Question answering. The task of automatically answering a question posed in natural language; that is, producing a human-language answer when given a human-language question.
Automatic summarization. The creation of a shortened version of a text document by a computer program that contains the most important points of the document.
Natural language generation. Systems transform information from computer databases into readable human language.
Natural language understanding. Systems convert samples of human language into more formal representations that are easier for computer programs to manipulate.
Machine translation. The automatic translation of one human language to another.
Foreign language reading. A computer program that assists a nonnative language speaker to read a foreign language.
Foreign language writing. A computer program that assists a nonnative language user in writing in a foreign language.
Speech recognition. Converts spoken words to machine-readable input.
Text-to-speech. A computer program converts normal language text into human speech.
Text proofing. A computer program reads a proof copy of a text in order to detect and correct any errors.
Optical character recognition. The automatic translation of images of handwritten, typewritten, or printed text into machine-readable text.
Diff: 2 Page Ref: 199

64) Describe a marketing application of text mining.
Answer: Text mining can be used to increase cross-selling and up-selling by analyzing the unstructured data generated by call centers. Text generated by call-center notes as well as transcriptions of voice conversations with customers can be analyzed by text mining algorithms to extract novel, actionable information about customers' perceptions toward a company's products and services. Text mining is important for customer relationship management (CRM). Companies can use text mining to analyze unstructured text data, combined with the relevant structured data extracted from organizational databases, to predict customer perceptions and subsequent purchasing behavior.
Diff: 2 Page Ref: 200

65) What is the primary purpose of text mining within the context of knowledge discovery?
Answer: The primary purpose of text mining within the context of knowledge discovery is to process unstructured (textual) data along with structured data, if relevant to the problem, to extract meaningful and actionable patterns for better decision making.
Diff: 1 Page Ref: 206

66) Diagram and explain the three-step text mining process.
Answer: See Figure 5.5 in the textbook.
Diff: 2 Page Ref: 207

67) List two options for managing or reducing the dimensionality (size) of the term-document matrix (TDM).
Answer:
A domain expert goes through the list of terms and eliminates those that do not make much sense for the context of the study.
Eliminate terms with very few occurrences in very few documents.
Transform the matrix using singular value decomposition.
Diff: 3 Page Ref: 210

68) What are three of the challenges for effective and efficient knowledge discovery posed by the Web?
Answer:
The Web is too big for effective data mining. Because of the sheer size of the Web, it is not feasible to set up a data warehouse to replicate, store, and integrate all of the data on the Web, making data collection and integration a challenge.
The Web is too complex. The complexity of a Web page is far greater than a page in a traditional text document collection. Web pages lack a unified structure.
The Web is too dynamic. The Web is a highly dynamic information source. Not only does the Web grow rapidly, but its content is constantly being updated.
The Web is not specific to a domain. The Web serves a broad diversity of communities and connects billions of workstations. Web users have very different backgrounds, interests, and usage purposes.
The Web has everything. Only a small portion of the information on the Web is truly relevant or useful to someone or some task.
Diff: 2 Page Ref: 217

69) Define the three main areas of Web mining and each area's source of information.
Answer:
Web content mining refers to the extraction of useful information from Web pages. Source: unstructured textual content of the Web pages, usually in HTML format.
Web structure mining is the process of extracting useful information from the links embedded in Web documents. Source: the URL links contained in the Web pages.
Web usage mining is the extraction of useful information from data generated through Web page visits and transactions. Source: the detailed description of a Web site's visits.
Diff: 2 Page Ref: 218

70) List three business applications of Web mining.
Answer: Any three of the following:
1. Determine the lifetime value of clients.
2. Design cross-marketing strategies across products.
3. Evaluate promotional campaigns.
4. Target electronic ads and coupons at user groups based on user access patterns.
5. Predict user behavior based on previously learned rules and users' profiles.
6. Present dynamic information to users based on their interests and profiles.
Diff: 2 Page Ref: 221
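
Study aid: several of the questions above (7, 8, 29, 50, 51, and 67) refer to stop words, stemming, establishing the corpus, and building the term-document matrix. The Python sketch below shows how those pieces could fit together on a tiny, made-up corpus. It is only an illustration under stated assumptions: the stop-word list, the suffix-stripping `stem` function (a toy, not a real stemming algorithm such as Porter's), the `min_docs` parameter, and the sample sentences are all hypothetical and do not come from the textbook.

```python
import re
from collections import Counter

# Hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"a", "am", "the", "was", "is", "are", "and", "of", "to", "in"}


def stem(word):
    """Toy stemmer: strip one common suffix (an assumption, not Porter's algorithm)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text, drop stop words, and stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def term_document_matrix(corpus, min_docs=1):
    """Return (terms, matrix); rows are documents, columns are term frequencies.

    Terms that appear in fewer than min_docs documents are dropped, which is
    one simple way to reduce the dimensionality of the matrix.
    """
    counts = [Counter(tokenize(doc)) for doc in corpus]
    doc_freq = Counter(term for c in counts for term in c)
    terms = sorted(t for t, df in doc_freq.items() if df >= min_docs)
    matrix = [[c[t] for t in terms] for c in counts]
    return terms, matrix


# Step 1: establish the corpus (made-up call-center style notes).
corpus = [
    "The customer was calling about billing errors.",
    "Billing calls and customer complaints are rising.",
    "The web site was updated.",
]

# Step 2: create the term-document matrix, keeping terms found in 2+ documents.
terms, tdm = term_document_matrix(corpus, min_docs=2)
print(terms)
for row in tdm:
    print(row)
```

Step 3, extracting knowledge (classification, clustering, association), would start from `tdm`. Singular value decomposition, the other reduction option named in question 67, is left out because it needs a numerical library.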
