Python for data science for dummies (Record no. 1688)

MARC details
000 -LEADER
fixed length control field 14271nam a22002537a 4500
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20220222160009.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 220222b ||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9788126524938
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.133
Item number MUE
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Mueller, John Paul
245 ## - TITLE STATEMENT
Title Python for data science for dummies
250 ## - EDITION STATEMENT
Edition statement 2nd
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Name of publisher, distributor, etc. Wiley India Pvt. Ltd.
Place of publication, distribution, etc. New Delhi
Date of publication, distribution, etc. 2021
300 ## - PHYSICAL DESCRIPTION
Extent xvi, 467 p.
365 ## - TRADE PRICE
Price type code INR
Price amount 699.00
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc. note
Introduction
About This Book
Foolish Assumptions
Icons Used in This Book
Beyond the Book
Where to Go from Here

Part 1: Getting Started With Data Science and Python

Chapter 1: Discovering the Match between Data Science and Python
Defining the Sexiest Job of the 21st Century
Considering the emergence of data science
Outlining the core competencies of a data scientist
Linking data science, big data, and AI
Understanding the role of programming
Creating the Data Science Pipeline
Preparing the data
Performing exploratory data analysis
Learning from data
Visualizing
Obtaining insights and data products
Understanding Python's Role in Data Science
Considering the shifting profile of data scientists
Working with a multipurpose, simple, and efficient language
Learning to Use Python Fast
Loading data
Training a model
Viewing a result

Chapter 2: Introducing Python's Capabilities and Wonders
Why Python?
Grasping Python's Core Philosophy
Contributing to data science
Discovering present and future development goals
Working with Python
Getting a taste of the language
Understanding the need for indentation
Working at the command line or in the IDE
Performing Rapid Prototyping and Experimentation
Considering Speed of Execution
Visualizing Power
Using the Python Ecosystem for Data Science
Accessing scientific tools using SciPy
Performing fundamental scientific computing using NumPy
Performing data analysis using pandas
Implementing machine learning using Scikit-learn
Going for deep learning with Keras and TensorFlow
Plotting the data using matplotlib
Creating graphs with NetworkX
Parsing HTML documents using Beautiful Soup

Chapter 3: Setting Up Python for Data Science
Considering the Off-the-Shelf Cross-Platform Scientific Distributions
Getting Continuum Analytics Anaconda
Getting Enthought Canopy Express
Getting WinPython
Installing Anaconda on Windows
Installing Anaconda on Linux
Installing Anaconda on Mac OS X
Downloading the Datasets and Example Code
Using Jupyter Notebook
Defining the code repository
Understanding the datasets used in this book

Chapter 4: Working with Google Colab
Defining Google Colab
Understanding what Google Colab does
Considering the online coding difference
Using local runtime support
Getting a Google Account
Creating the account
Signing in
Working with Notebooks
Creating a new notebook
Opening existing notebooks
Saving notebooks
Downloading notebooks
Performing Common Tasks
Creating code cells
Creating text cells
Creating special cells
Editing cells
Moving cells
Using Hardware Acceleration
Executing the Code
Viewing Your Notebook
Displaying the table of contents
Getting notebook information
Checking code execution
Sharing Your Notebook
Getting Help

Part 2: Getting Your Hands Dirty With Data

Chapter 5: Understanding the Tools
Using the Jupyter Console
Interacting with screen text
Changing the window appearance
Getting Python help
Getting IPython help
Using magic functions
Discovering objects
Using Jupyter Notebook
Working with styles
Restarting the kernel
Restoring a checkpoint
Performing Multimedia and Graphic Integration
Embedding plots and other images
Loading examples from online sites
Obtaining online graphics and multimedia

Chapter 6: Working with Real Data
Uploading, Streaming, and Sampling Data
Uploading small amounts of data into memory
Streaming large amounts of data into memory
Generating variations on image data
Sampling data in different ways
Accessing Data in Structured Flat-File Form
Reading from a text file
Reading CSV delimited format
Reading Excel and other Microsoft Office files
Sending Data in Unstructured File Form
Managing Data from Relational Databases
Interacting with Data from NoSQL Databases
Accessing Data from the Web

Chapter 7: Conditioning Your Data
Juggling between NumPy and pandas
Knowing when to use NumPy
Knowing when to use pandas
Validating Your Data
Figuring out what's in your data
Removing duplicates
Creating a data map and data plan
Manipulating Categorical Variables
Creating categorical variables
Renaming levels
Combining levels
Dealing with Dates in Your Data
Formatting date and time values
Using the right time transformation
Dealing with Missing Data
Finding the missing data
Encoding missingness
Imputing missing data
Slicing and Dicing: Filtering and Selecting Data
Slicing rows
Slicing columns
Dicing
Concatenating and Transforming
Adding new cases and variables
Removing data
Sorting and shuffling
Aggregating Data at Any Level

Chapter 8: Shaping Data
Working with HTML Pages
Parsing XML and HTML
Using XPath for data extraction
Working with Raw Text
Dealing with Unicode
Stemming and removing stop words
Introducing regular expressions
Using the Bag of Words Model and Beyond
Understanding the bag of words model
Working with n-grams
Implementing TF-IDF transformations
Working with Graph Data
Understanding the adjacency matrix
Using NetworkX basics

Chapter 9: Putting What You Know in Action
Contextualizing Problems and Data
Evaluating a data science problem
Researching solutions
Formulating a hypothesis
Preparing your data
Considering the Art of Feature Creation
Defining feature creation
Combining variables
Understanding binning and discretization
Using indicator variables
Transforming distributions
Performing Operations on Arrays
Using vectorization
Performing simple arithmetic on vectors and matrices
Performing matrix vector multiplication
Performing matrix multiplication

Part 3: Visualizing Information

Chapter 10: Getting a Crash Course in MatPlotLib
Starting with a Graph
Defining the plot
Drawing multiple lines and plots
Saving your work to disk
Setting the Axis, Ticks, Grids
Getting the axes
Formatting the axes
Adding grids
Defining the Line Appearance
Working with line styles
Using colors
Adding markers
Using Labels, Annotations, and Legends
Adding labels
Annotating the chart
Creating a legend

Chapter 11: Visualizing the Data
Choosing the Right Graph
Showing parts of a whole with pie charts
Creating comparisons with bar charts
Showing distributions using histograms
Depicting groups using boxplots
Seeing data patterns using scatterplots
Creating Advanced Scatterplots
Depicting groups
Showing correlations
Plotting Time Series
Representing time on axes
Plotting trends over time
Plotting Geographical Data
Using an environment in Notebook
Getting the Basemap toolkit
Dealing with deprecated library issues
Using Basemap to plot geographic data
Visualizing Graphs
Developing undirected graphs
Developing directed graphs

Part 4: Wrangling Data

Chapter 12: Stretching Python's Capabilities
Playing with Scikit-learn
Understanding classes in Scikit-learn
Defining applications for data science
Performing the Hashing Trick
Using hash functions
Demonstrating the hashing trick
Working with deterministic selection
Considering Timing and Performance
Benchmarking with timeit
Working with the memory profiler
Running in Parallel on Multiple Cores
Performing multicore parallelism
Demonstrating multiprocessing

Chapter 13: Exploring Data Analysis
The EDA Approach
Defining Descriptive Statistics for Numeric Data
Measuring central tendency
Measuring variance and range
Working with percentiles
Defining measures of normality
Counting for Categorical Data
Understanding frequencies
Creating contingency tables
Creating Applied Visualization for EDA
Inspecting boxplots
Performing t-tests after boxplots
Observing parallel coordinates
Graphing distributions
Plotting scatterplots
Understanding Correlation
Using covariance and correlation
Using nonparametric correlation
Considering the chi-square test for tables
Modifying Data Distributions
Using different statistical distributions
Creating a Z-score standardization
Transforming other notable distributions

Chapter 14: Reducing Dimensionality
Understanding SVD
Looking for dimensionality reduction
Using SVD to measure the invisible
Performing Factor Analysis and PCA
Considering the psychometric model
Looking for hidden factors
Using components, not factors
Achieving dimensionality reduction
Squeezing information with t-SNE
Understanding Some Applications
Recognizing faces with PCA
Extracting topics with NMF
Recommending movies

Chapter 15: Clustering
Clustering with K-means
Understanding centroid-based algorithms
Creating an example with image data
Looking for optimal solutions
Clustering big data
Performing Hierarchical Clustering
Using a hierarchical cluster solution
Using a two-phase clustering solution
Discovering New Groups with DBScan

Chapter 16: Detecting Outliers in Data
Considering Outlier Detection
Finding more things that can go wrong
Understanding anomalies and novel data
Examining a Simple Univariate Method
Leveraging on the Gaussian distribution
Making assumptions and checking out
Developing a Multivariate Approach
Using principal component analysis
Using cluster analysis for spotting outliers
Automating detection with Isolation Forests

Part 5: Learning From Data

Chapter 17: Exploring Four Simple and Effective Algorithms
Guessing the Number: Linear Regression
Defining the family of linear models
Using more variables
Understanding limitations and problems
Moving to Logistic Regression
Applying logistic regression
Considering when classes are more
Making Things as Simple as Naïve Bayes
Finding out that Naïve Bayes isn't so naïve
Predicting text classifications
Learning Lazily with Nearest Neighbors
Predicting after observing neighbors
Choosing your k parameter wisely

Chapter 18: Performing Cross-Validation, Selection, and Optimization
Pondering the Problem of Fitting a Model
Understanding bias and variance
Defining a strategy for picking models
Dividing between training and test sets
Cross-Validating
Using cross-validation on k folds
Sampling stratifications for complex data
Selecting Variables Like a Pro
Selecting by univariate measures
Using a greedy search
Pumping Up Your Hyperparameters
Implementing a grid search
Trying a randomized search

Chapter 19: Increasing Complexity with Linear and Nonlinear Tricks
Using Nonlinear Transformations
Doing variable transformations
Creating interactions between variables
Regularizing Linear Models
Relying on Ridge regression (L2)
Using the Lasso (L1)
Leveraging regularization
Combining L1 & L2: Elasticnet
Fighting with Big Data Chunk by Chunk
Determining when there is too much data
Implementing Stochastic Gradient Descent
Understanding Support Vector Machines
Relying on a computational method
Fixing many new parameters
Classifying with SVC
Going nonlinear is easy
Performing regression with SVR
Creating a stochastic solution with SVM
Playing with Neural Networks
Understanding neural networks
Classifying and regressing with neurons

Chapter 20: Understanding the Power of the Many
Starting with a Plain Decision Tree
Understanding a decision tree
Creating trees for different purposes
Making Machine Learning Accessible
Working with a Random Forest classifier
Working with a Random Forest regressor
Optimizing a Random Forest
Boosting Predictions
Knowing that many weak predictors win
Setting a gradient boosting classifier
Running a gradient boosting regressor
Using GBM hyperparameters

Part 6: The Part of Tens

Chapter 21: Ten Essential Data Resources
Discovering the News with Subreddit
Getting a Good Start with KDnuggets
Locating Free Learning Resources with Quora
Gaining Insights with Oracle's Data Science Blog
Accessing the Huge List of Resources on Data Science Central
Learning New Tricks from the Aspirational Data Scientist
Obtaining the Most Authoritative Sources at Udacity
Receiving Help with Advanced Topics at Conductrics
Obtaining the Facts of Open Source Data Science from Masters
Zeroing In on Developer Resources with Jonathan Bower

Chapter 22: Ten Data Challenges You Should Take
Meeting the Data Science London + Scikit-learn Challenge
Predicting Survival on the Titanic
Finding a Kaggle Competition that Suits Your Needs
Honing Your Overfit Strategies
Trudging Through the MovieLens Dataset
Getting Rid of Spam E-mails
Working with Handwritten Information
Working with Pictures
Analyzing Amazon.com Reviews
Interacting with a Huge Graph

Index
520 ## - SUMMARY, ETC.
Summary, etc. About the Author
John Mueller is a freelance author and technical editor. He has writing in his blood, having produced 99 books and more than 600 articles to date. The topics range from networking to home security and from database management to heads-down programming. During his time at Cubic Corporation, John was exposed to reliability engineering and has had a continued interest in probability ever since.

Luca Massaron is a data scientist specialized in organizing and interpreting big data and transforming it into smart data by means of the simplest and most effective data mining and machine learning techniques. Because of his job as a quantitative marketing consultant and marketing researcher, he has been involved in quantitative data since 2000 with different clients and in various industries. Luca was able to quickly rank among the top 10 Kaggle data scientists.
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Python (Computer program language)
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data mining
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Programming languages (Electronic computers)
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data structures (Computer science)
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name Massaron, Luca
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Book
Holdings
Withdrawn status
Lost status
Source of classification or shelving scheme Dewey Decimal Classification
Damaged status
Not for loan
Collection code IT & Decisions Sciences
Bill No IB/IN/1212
Bill Date 09-02-2022
Home library Indian Institute of Management LRC
Current library Indian Institute of Management LRC
Shelving location General Stacks
Date acquired 02/22/2022
Source of acquisition International Book Centre
Cost, normal purchase price 489.30
Total Checkouts 3
Full call number 005.133 MUE
Accession Number 001882
Date last seen 09/13/2023
Date checked out 08/09/2023
Cost, replacement price 699.00
Price effective from 02/22/2022
Koha item type Book

©2019-2020 Learning Resource Centre, Indian Institute of Management Bodhgaya
