Advances in Financial Machine Learning

  • Create Date:2021-04-17 14:56:05
  • Update Date:2025-09-07
  • Author:Marcos Lopez de Prado
  • ISBN:1119482089

Summary

Machine learning (ML) is changing virtually every aspect of our lives. Today ML algorithms accomplish tasks that until recently only expert humans could perform. As it relates to finance, this is the most exciting time to adopt a disruptive technology that will transform how everyone invests for generations. Readers will learn how to structure Big data in a way that is amenable to ML algorithms; how to conduct research with ML algorithms on that data; how to use supercomputing methods; how to backtest your discoveries while avoiding false positives. The book addresses real-life problems faced by practitioners on a daily basis, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their particular setting. Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.

Reviews

Ayush

Gold. The author has done a public service by sharing so many useful and mathematically grounded approaches. I see a lot of self-referencing in the research, but what can be done? Prado really is best positioned to write this book, as he is both a practitioner and an academic. I'll read this book twice, this time with the end-of-chapter exercises, to finally nail the concepts down.

Cons: The math is confusing at times, and the accompanying code makes Python look like C++.

Thiago Marzagão

I don't have any background in finance, so a lot in this book was completely new to me. I had never thought that you could use volume or dollar bars (as opposed to time bars), for instance. I imagine finance folks have known that for a long time, but it completely blew my mind. (Though there is some good pushback here: "when acting based on volume traded we may be too late already.") Similarly, I had never thought about how differentiating might eliminate signal. I saw fractional differentiation in grad school, in passing, but I had never found real-world applications for it (probably because I haven't done a lot of work with time series). I finally understand how it can be useful. Also, I had never thought that you could do cross-validation in a combinatorial way, with multiple test sets at a time. That is something I will probably try in non-finance work too.

Now, I wish de Prado had separated data and strategy more clearly. Take chapter 13, "Backtesting on Synthetic Data", for example. You'd expect it to be about using Monte Carlo simulations to generate synthetic market data (which you could then use to backtest your strategy). But no. The synthetic data includes the trading rules themselves. Why conflate both things like this? What if I just want to simulate market data, so that later I can use it to backtest whatever strategy I want? That might keep things simpler, clearer, easier to handle. (Can we even call it backtesting if you're trying to optimize the parameters of your strategy? Aren't these different things? What am I missing here?)

It's the same with the triple barrier labeling. I get it - "Every investment strategy has stop-loss limits" (p. 44). But it conflates data and strategy, which not only makes the whole thing harder to reason about, but also introduces problems like leakage and low uniqueness, which de Prado then spends entire chapters explaining how to fix. Those problems wouldn't exist in the first place if our labels were not dependent on our strategy. The man is clearly a genius (and he knows that - the book has a Taleb-ish style at times), and I'm sure he has good reasons for suggesting triple barrier labeling, but where is the evidence that all this additional complication pays off in terms of higher Sharpe or some other metric?

I also wonder whether some of these choices - dollar bars vs volume bars, triple barrier labeling vs conventional labeling, etc. - could be learned from the data. I know, I know, overfitting. But I wish de Prado had discussed this possibility, even if to dismiss it. (Maybe reinforcement learning is the final destination? Btw, it would be great if de Prado could add a chapter on reinforcement learning for finance in a future edition.)

Finally, this is a super dense book (though de Prado warns you of that right in the beginning). Some passages I had to read 2, 3, 4 times to understand, and a few passages I just didn't understand at all. I hadn't struggled so hard with a technical book since grad school. The language could be clearer and more precise. The notation could be more consistent. And examples with real-world data would help. (There are lots of code snippets though, and they are super helpful.)

All that said, this is an amazing book that anyone who wants to do algorithmic trading should read.
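For readers who, like this reviewer, have not met dollar bars before, the idea can be sketched in a few lines. This is only an illustrative sketch and not the book's code: the function name `dollar_bars`, the `(price, volume)` tick format, and the rule of discarding a trailing incomplete bar are all assumptions made here for the example.

```python
def dollar_bars(ticks, threshold):
    """Group (price, volume) ticks into bars by traded dollar value.

    A bar is closed as soon as the cumulative price * volume inside it
    reaches `threshold` (hypothetical parameter name). Ticks left over
    at the end that do not fill a bar are discarded.
    """
    bars = []
    current = []      # ticks collected for the bar being built
    dollars = 0.0     # dollar value traded so far in that bar
    for price, volume in ticks:
        current.append((price, volume))
        dollars += price * volume
        if dollars >= threshold:
            prices = [p for p, _ in current]
            bars.append({
                "open": prices[0],
                "high": max(prices),
                "low": min(prices),
                "close": prices[-1],
                "volume": sum(v for _, v in current),
                "dollars": dollars,
            })
            current, dollars = [], 0.0
    return bars


if __name__ == "__main__":
    sample = [(10.0, 5), (11.0, 5), (12.0, 10), (9.0, 1)]
    for bar in dollar_bars(sample, threshold=100.0):
        print(bar)
```

Unlike time bars, which sample at fixed clock intervals, the bars here are produced more often when more value changes hands. Volume bars would follow the same pattern with `volume` accumulated in place of `price * volume`.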

Alireza Nourian

The book is full of very good ideas for algorithmic trading, the product of the author's years of research. Deep writing in this area is hard to find, and this book is a treasure. That said, the writing is really weak and does not present the topics in an understandable way at all.

Tiago Quevedo Teodoro

Benchmark in the field. Every quant in finance must carry this one in her pocket. Detailed, in-depth and with code examples. Certainly one of the best books in the field of finance overall.

Ferhat Culfaz

A recycling of many of his papers in book form. It has the cutting edge, but it is a mix of the very specific and, at the same time, the very vague. A very advanced text that assumes you have vast prior knowledge. Very theoretical, yet it contains snippets of Python code for implementation. Good bibliography after each chapter.

David

The single most important point of the book is the characterization of the failure modes of systematic (quant) outfits: what almost never works, and what he has seen at least sometimes work. This is extremely useful and is possibly applicable to organizations outside of the systematic domain. de Prado also has a paper covering much the same topics.

Overall the book is useful since few are writing books like this. I only wish more effort and time had been put into it to increase the quality and output. The tone of the book is one which encourages rote learning and an ignorance of the general setting.

Worth at least skimming, since many people will read it and it does set out a language and some methods used in practice. Like a lot of de Prado's work, there are useful heuristics, but deep understanding is not always on offer, and seldom is an attempt made to create links between the problems and the heuristic solutions. Some of the code examples are of poor quality and the text itself is poorly formatted. There are some groups like https://hudsonthames.org/ working to put together cleaned up and improved versions of the methods presented in the book (among other things), so do see them for reference.

Will

I don't code but the text was pretty accessible.

Tony Murray

Overall a decent textbook, but one that I found too abstract to really dig into. I'm sure for specific people it is great, but as someone who is technically inclined, it just felt a bit too much about him referencing his papers and prior text. I was honestly hoping to be able to translate some of the code snippets from Python into R, but the code was very sparsely commented. I am working on a couple of simulations that the author coded and hope to get those translated. So overall it was a 4 star book.

Jaume Sués Caula

Not recommended reading if you are starting out in quantitative trading. The technical depth is astonishing, with great real-life examples.

In my case, I wanted to immerse myself to get the argot and a sense of the complexity of this world (just after reading Jim Simons' biography).

Tadas Talaikis

For those interested, here is a package based on this book.

Ifor Williams

To date, best book on ML for trading - by far.

Jason Orthman

Very difficult book to rate and review, as it's effectively a textbook for advanced participants in the field of coding (Python) and financial machine learning. The concepts and principles are still important. There is no easy win for fund managers who want to utilise financial machine learning to attain alpha. You will need a highly experienced team of skilled professionals across finance, coding, mathematics etc. that will continue to keep evolving while avoiding common problems such as over-fitting, back-testing etc.

Randy Carlson

Not bad. Very technical on both the finance end and the technical end.

Oleksandr Nikitin

Given the overall sad state of the literature in this area, it's good. Also, it's entertaining. Just don't expect it to be a guide of any kind.

Ben

Application of ML algorithms to financial data is straightforward, at least in a technical sense.

Practically, God (or the devil) is in the details.

Azam

Excellent book with practical examples and issues in financial machine learning.

Kirill

This book contains an overview of tricks and techniques useful for time series analysis. I bet you do not know at least 10 of them even if you work with time series on a daily basis. Almost every mathematical description is accompanied by a code sample, and this is a gem that gives this book real value. It would be great if other books in ML had the same level of reproducibility AND mathematical rigor.

Denis Vasilev

Practical advice on applying ML to trading on the stock markets. Everything is to the point; it was very interesting to look at the key issues of working in one of the most competitive markets.

BCS

Machine Learning is about gaining confidence in your algorithm. Looking at a financial trading model, you only get a limited amount of data from, for example, Bloomberg services on which to formulate confidence. Drilling down, you may approximate third party transactions on which you can only obtain partial viability. In this book we look at the various factors that obscure a supply data model and which therefore reduce the information that may be derived. Given a large and diverse supply population, backtesting becomes a crucial retrospective that may give pointers to trading forecasts, but they are only pointers; looking backwards is at best a simple guide to forecasting. However, there are several ways of analysing supply data for subsequent information.

Having gained separate PhDs in Financial Economics and Mathematical Finance, and holding multiple patent applications on algorithmic trading, our Dr and one-time academic Marcos Lopez de Prado now manages several multibillion-dollar funds using ML algorithms.

Complex, often inter-related topics are covered with a simplicity that only comes from mastery of the subject areas, useful to every data analyst and business analyst supporting risk management. Being a prolific author, Lopez often references back to his prior publications. A matrix relates each topic - Financial Data, Software, Hardware, Math (well, he is American!), Meta-Strat and Overfitting - to each chapter and part within the book. However, defining the Sharpe Ratio and its common derivatives such as the Deflated Sharpe Ratio after its extensive earlier use is a presentational faux pas.

Standard industry financial risk models come with copious programming snippets in Python. Ultimately, terms such as molecules and atoms are used when trying to illustrate parallel (Python) programming. Using KISS (Keep it Simple and Stupid) methodology, should Python Go?

Several elementary tools are introduced to visualise supply market data, such as Time Bars, Tick Bars, Volume Bars and Dollar Bars, and then one vertical and two horizontal barriers, information derivative bars and then Multi-Threaded Monte Carlo.

Cross-validation (CV) splits supply data into either Training or Testing pools to assist in model development and backtesting. Considering K-fold CV, a popular model, to be faulty, Lopez uses hyper-parameters in his own Purged k-fold CV to reduce leakage using purging and embargoes.

Backtesting (i.e. Stress Testing) is considered from several viewpoints. Three major Walk-Forward (WF) disadvantages are each considered: a single scenario can easily be overfit, WF is not normally representative of future performance, and its initial decisions are made on a limited portion of the total sample. Arguing against the benefits of WF, Lopez concludes the goal is to infer future performance from a number of out-of-sample scenarios. Extending his Purged k-fold principles, Lopez offers his Combinatorial Purged Cross-Validation method (CPCV), claiming it leads to fewer false discoveries, easily defeating WF overfitting.

Strategic risk and then portfolio risk are well covered, though this is a relatively stable topic with few recent advances.

Basics covered, we then focus on Asset Allocation, with increasing use of GCE A Level mathematics, looking at Markowitz's Curse, Tree Clustering, Out-of-Sample Monte Carlo Simulations, Inverse Variance Allocation and others. Shannon's Entropy and other financial applications of entropy then follow, as does a review of Microstructural Feature publications, including various lambdas such as Kyle's and Hasbrouck's.

Simple illustrations conclude with brute force used in Quantum Computing to find optimum solutions by examining all feasible solutions at the same time. Considering the maturity of Quantum Computing, the number of advances, especially to standard models, seems somewhat lacking.

Finally, Dr Kesheng Wu and Dr Horst Simon look at Hierarchical Data Format 5 (HDF5), Supernova hunting, and High Performance Computing (HPC) against Cloud computing, reasoning that HPC offers better cost effectiveness and higher performance. Several use cases are presented, including Intraday Peak Electricity Usage, which provides recent interesting insights (2014) from a summertime study of American Advanced Metering Infrastructure (AMI), the Flash Crash of 2010, and High Frequency Events with the Non-uniform Fast Fourier Transform used in the natural gas futures market.

Advances in Financial Machine Learning is a very interesting book. I would give it 8 out of 10 - the author knows his subject.

Review by Paul Ramsay
Originally published: https://www.bcs.org/content/conWebDoc...
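The purging and embargo idea mentioned in this review can be illustrated with a small sketch. It is a simplified illustration under stated assumptions, not the book's implementation: observations are indexed by integer position, `label_end[i]` (a hypothetical input) is the last position that observation i's label depends on, and `embargo` is a count of observations rather than a fraction of the sample.

```python
def purged_kfold(label_end, n_splits=3, embargo=0):
    """Yield (train, test) position lists for contiguous test folds.

    A training observation is purged when its label span
    [i, label_end[i]] overlaps the span covered by the test labels.
    Observations within `embargo` positions after that span are
    dropped as well. Assumes len(label_end) >= n_splits.
    """
    n = len(label_end)
    # Fold sizes differ by at most one observation.
    sizes = [n // n_splits + (1 if k < n % n_splits else 0)
             for k in range(n_splits)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        test_start = test[0]
        # Last position that any test label depends on.
        test_end = max(label_end[i] for i in test)
        train = []
        for i in range(n):
            overlaps = i <= test_end and label_end[i] >= test_start
            embargoed = test_end < i <= test_end + embargo
            if not overlaps and not embargoed:
                train.append(i)
        yield train, test
        start += size


if __name__ == "__main__":
    # Each label looks one observation ahead (capped at the last position).
    ends = [min(i + 1, 9) for i in range(10)]
    for train, test in purged_kfold(ends, n_splits=5, embargo=1):
        print("test", test, "train", train)
```

In this toy setup, the fold with test positions 4 and 5 also loses positions 3 and 6 to purging and position 7 to the embargo, because their labels share information with the test labels or follow them too closely.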

Max Bolingbroke

Read his free paper on hierarchical risk parity (SSRN 2708678) instead.

IOANNIS TSIOKOS

Knowledge like this is hard to come by because it is much more profitable to implement it than to write about it. Marcos must have had an urge to share his knowledge that overwhelmed the common wisdom in this industry - to not share or sell anything that works.

Terran M

This book is for people who already understand machine learning or predictive modeling, and who already understand investment, and would like some guidance on applying the one to the other. It is an excellent book if and only if you meet these conditions.

The author has a hint of Taleb-style arrogance, wanting to be recognized for being the smartest person in the room, but not enough to impede enjoyment of the book, and it answers the question of why he published it at all in a field which is otherwise characterized by "those who know do not say."

Kyle

If you're coming from a computer science and/or machine learning background, you will learn a lot about how to frame your algorithmic thinking in the domain of finance, and the book will leave you hungry for more hardcore graph theory, parallelization, machine learning (beyond simple random forest ensembles and clustering), advanced algorithms, and gritty details of implementation, which are left for you to explore and enjoy. The purpose of this book is not to explain how to apply Deep Learning to make money, but rather to lay a solid foundation of how to invest in a scientifically rigorous fashion given the modern machine learning toolset and access to PBs of data. In many cases, rather than focussing on the specifics of any given model, Dr. Lopez de Prado focuses on generating and selecting useful features. The book, which is a hybrid of a textbook and a manual, explains using both formal mathematics and empirical evidence why many of the assumptions about Machine Learning applied to the financial world are wrong, and follows through with rigorous and practical solutions. For example, one of the most common false assumptions addressed in the book is that of IID samples in financial time series data. Dr. Lopez de Prado manages to pull together ideas from a wide spectrum of academic disciplines including mathematics, econometrics, machine learning, computer science, information theory, and physics to build a strong scientific basis upon which to algorithmically invest. Despite the diversity of subject matter, the book progresses well, building on and reusing early themes and then exploring domain specific topics like market microstructure and quantum computing. Source code to implement many of the methods is provided as a practical toolkit to test out the claims presented. The thorough use of references is particularly helpful as it keeps the content fairly short and to the point. Speed reading not recommended. Using a programming analogy, the mathematical notation is more reminiscent of the explicit verbosity of C++ than that of Python (which is used in the book and is meant to be concise). It's not much of a problem, but be aware the information content is dense.

Something that's mentioned but not explored is how to make use of "alternative datasets". Given many of the advances in the wider realm of ML have been around data you don't get from exchanges, it would be nice if some helpful pointers or references for dealing with alternative data were included. That said, it's not the end of the world given the wealth of resources online for analyzing text, image, and video data.

Buy this book if you're an experienced programmer getting into Finance or a Financial Professional looking to strengthen your algorithmic understanding. It is densely packed with a wealth of practical methods and breaks down and offers alternatives to faulty investing science.