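Python Data Analysis (Reprint Edition)

Overall rating: ★★★★★

List price: 68.00

Author: Idris, I. (Indonesia)

Publisher: Southeast University Press

Publication date: January 2016

Pages: 329

Word count: 426,000

ISBN: 9787564160647

About the Book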
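Python is a multi-paradigm programming language, well suited both to object-oriented application development and to functional design patterns. It has become the language of choice for data scientists working in data analysis, visualization, and machine learning, and it offers high efficiency and productivity.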

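  Idris's Python Data Analysis (Reprint Edition, English-language) teaches beginners how to tap Python's full potential for data analysis, covering every related topic from data retrieval, cleaning, manipulation, visualization, and storage to complex analysis and modeling. It focuses on a set of open-source Python modules such as NumPy, SciPy, matplotlib, pandas, IPython, Cython, scikit-learn, and NLTK. Later chapters cover data visualization, signal processing and time series analysis, databases, predictive analytics, and machine learning. The book aims to turn you into a first-rate data analyst in no time.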

Table of Contents
Preface

Chapter 1: Getting Started with Python Libraries

Software used in this book

Installing software and setup

On Windows

On Linux

On Mac OS X

Building NumPy, SciPy, matplotlib, and IPython from source

Installing with setuptools

NumPy arrays

A simple application

Using IPython as a shell

Reading manual pages

IPython notebooks

Where to find help and references

Summary

Chapter 2: NumPy Arrays

The NumPy array object

The advantages of NumPy arrays

Creating a multidimensional array

Selecting NumPy array elements

NumPy numerical types

Data type objects

Character codes

The dtype constructors

The dtype attributes

One-dimensional slicing and indexing

Manipulating array shapes

Stacking arrays

Splitting NumPy arrays

NumPy array attributes

Converting arrays

Creating array views and copies

Fancy indexing

Indexing with a list of locations

Indexing NumPy arrays with Booleans

Broadcasting NumPy arrays

Summary

Chapter 3: Statistics and Linear Algebra

NumPy and SciPy modules

Basic descriptive statistics with NumPy

Linear algebra with NumPy

Inverting matrices with NumPy

Solving linear systems with NumPy

Finding eigenvalues and eigenvectors with NumPy

NumPy random numbers

Gambling with the binomial distribution

Sampling the normal distribution

Performing a normality test with SciPy

Creating a NumPy-masked array

Disregarding negative and extreme values

Summary

Chapter 4: pandas Primer

Installing and exploring pandas

pandas DataFrames

pandas Series

Querying data in pandas

Statistics with pandas DataFrames

Data aggregation with pandas DataFrames

Concatenating and appending DataFrames

Joining DataFrames

Handling missing values

Dealing with dates

Pivot tables

Remote data access

Summary

Chapter 5: Retrieving, Processing, and Storing Data

Writing CSV files with NumPy and pandas

Comparing the NumPy .npy binary format and pickling pandas DataFrames

Storing data with PyTables

Reading and writing pandas DataFrames to HDF5 stores

Reading and writing to Excel with pandas

Using REST web services and JSON

Reading and writing JSON with pandas

Parsing RSS and Atom feeds

Parsing HTML with Beautiful Soup

Summary

Chapter 6: Data Visualization

matplotlib subpackages

Basic matplotlib plots

Logarithmic plots

Scatter plots

Legends and annotations

Three-dimensional plots

Plotting in pandas

Lag plots

Autocorrelation plots

Plot.ly

Summary

Chapter 7: Signal Processing and Time Series

statsmodels subpackages

Moving averages

Window functions

Defining cointegration

Autocorrelation

Autoregressive models

ARMA models

Generating periodic signals

Fourier analysis

Spectral analysis

Filtering

Summary

Chapter 8: Working with Databases

Lightweight access with sqlite3

Accessing databases from pandas

SQLAlchemy

Installing and setting up SQLAlchemy

Populating a database with SQLAlchemy

Querying the database with SQLAlchemy

Pony ORM

Dataset - databases for lazy people

PyMongo and MongoDB

Storing data in Redis

Apache Cassandra

Summary

Chapter 9: Analyzing Textual Data and Social Media

Installing NLTK

Filtering out stopwords, names, and numbers

The bag-of-words model

Analyzing word frequencies

Naive Bayes classification

Sentiment analysis

Creating word clouds

Social network analysis

Summary

Chapter 10: Predictive Analytics and Machine Learning

A tour of scikit-learn

Preprocessing

Classification with logistic regression

Classification with support vector machines

Regression with ElasticNetCV

Support vector regression

Clustering with affinity propagation

Mean Shift

Genetic algorithms

Neural networks

Decision trees

Summary

Chapter 11: Environments Outside the Python Ecosystem and Cloud Computing

Exchanging information with MATLAB/Octave

Installing rpy2

Interfacing with R

Sending NumPy arrays to Java

Integrating SWIG and NumPy

Integrating Boost and Python

Using Fortran code through f2py

Setting up Google App Engine

Running programs on PythonAnywhere

Working with Wakari

Summary

Chapter 12: Performance Tuning, Profiling, and Concurrency

Profiling the code

Installing Cython

Calling C code

Creating a process pool with multiprocessing

Speeding up embarrassingly parallel for loops with Joblib

Comparing Bottleneck to NumPy functions

Performing MapReduce with Jug

Installing MPI for Python

IPython Parallel

Summary

Appendix A: Key Concepts

Appendix B: Useful Functions

matplotlib

NumPy

pandas

scikit-learn

SciPy

scipy.fftpack

scipy.signal

scipy.stats

Appendix C: Online Resources

Index
