Statistical Physics with R: Ising Model with Monte Carlo

98 points by northlondoner 12 hours ago

epistasis 7 hours ago

Does anybody have a reason this particular repo might be of special interest today? It appears to be the methods from something published a decade ago, and has had a handful of commits since then, including yesterday. There's not a ton of stars on the github, but certainly more stars than most scientific methods get.

northlondoner 7 hours ago

Great point. Hope the following adds the context that was missing:
Apart from intellectual appeal,
(1) There is a new paper from Google about quantum ergodicity, see https://doi.org/10.48550/arXiv.2506.10191 . So the tech community in general can benefit a lot from understanding ergodicity via this package and seeing hands-on how it is implemented; see the vignette as well, https://cran.r-project.org/web/packages/isingLenzMC/vignette...
(2) The repo is part of ergodicity research that is now being revisited from a classical point of view. The new commits are actually significant: a new dataset was generated, see https://zenodo.org/records/17151290 , so reproducibility is amazing even after so many years.
- epistasis 6 hours ago
  
  Thanks! This is a bit beyond my depth, and I wouldn't have figured anything out without this helpful comment. But some spin glass modeling ideas were very useful for a paper during my grad school days, so I think it's time to revisit it to see if I can steal any more great methods.
random3 7 hours ago

No idea. But curious: why would something that was published a decade or longer ago not be of interest today?
- epistasis 7 hours ago
  
  The commit yesterday, on a very quiet repo, made me think there might be something more going on, that might be obvious to those who study the topic but not so obvious to a broader audience.
  Also, older papers can be of interest but don't usually make it to the front page of a general audience news site unless there's something bigger going on that gave it renewed general interest.
- randmeerkat 7 hours ago
  
  > No idea. But curious: why would something that was published a decade or longer ago not be of interest today?
  Because if it isn’t in the “hype” it’s worthless, obsolete, trash…
  Welcome to the new world of tech, that warms its hands by burning the old world.
  Enjoy the vibes…
  - oersted 6 hours ago
    
    Just wanted to stop by and say: cool writing.
    I don't quite agree, rather melodramatic, but it really paints a picture.
chermi 7 hours ago

Edit: Nvm, I was being a grump. I'm glad people like this sort of work.
- epistasis 6 hours ago
  
  HN is highly stochastic, and honestly I don't think there's any way around it.
  The upvoting scheme cannot distinguish a hot topic of interest to 10% of readers, which delivers 10 upvotes in the first 10 minutes, from a more niche topic of interest to 5% of readers, which also gets 10 upvotes in 10 minutes.
  At least it can't distinguish at that time. So things go to the front page, and future votes determine what happens!
  But that initial "on the front page" boost is a nonlinearity that many good posts do not get through.
  Personally, I really liked this post, and was merely asking because I was very surprised others liked it too!

DrNosferatu 9 hours ago

Can anyone recommend a good, easy-to-understand general tutorial or book on Monte Carlo methods for beginners?

oersted 8 hours ago

The "Monte Carlo" term is deceptively fancy: most algorithms with that label simply involve randomly picking from a set of possibilities (the search space), as opposed to trying all cases or exploring in some other systematic way.
When you try to do that for real problems, it can sometimes be difficult to sample from complex probability distributions/models efficiently in a way that is representative. There are lots of tricks around that; like most topics, it's a black hole of details. But it still boils down to randomly testing options.
Look at the source code, even in C it's really short and simple: https://github.com/msuzen/isingLenzMC/blob/master/src/isingL...
Statisticians like to do this kind of intellectual inflation; there are many such scary terms with simple meanings: a "Markov chain" is a process whose next state depends only on the current state, "stochastic" is a straight-up synonym for "random"... Illegitimi non carborundum!
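To make that concrete, here is a minimal sketch of a single-spin-flip Metropolis sampler for a 1D Ising chain. It is written in Python for illustration and is not the package's C code; the function names, the defaults (J=1, h=0, periodic boundaries) and the choice of Metropolis updates are all assumptions made for this example.

```python
import math
import random


def metropolis_sweep(spins, beta, J=1.0, h=0.0, rng=random):
    """One sweep (len(spins) attempted flips) over a periodic 1D Ising chain.

    Assumed Hamiltonian: H = -J * sum(s_i * s_{i+1}) - h * sum(s_i).
    """
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)  # randomly pick a site: the "Monte Carlo" part
        left, right = spins[i - 1], spins[(i + 1) % n]  # periodic neighbours
        # Energy change if spin i were flipped
        delta_e = 2.0 * spins[i] * (J * (left + right) + h)
        # Always accept downhill moves; accept uphill ones with prob exp(-beta*dE).
        # The next state depends only on the current one: the "Markov chain" part.
        if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
            spins[i] = -spins[i]
    return spins


def mean_abs_magnetisation(n=100, beta=0.5, sweeps=2000, burn_in=500, seed=0):
    """Average |M|/N over the sweeps that follow the burn-in period."""
    rng = random.Random(seed)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    total = 0.0
    for sweep in range(sweeps):
        metropolis_sweep(spins, beta, rng=rng)
        if sweep >= burn_in:
            total += abs(sum(spins)) / n
    return total / (sweeps - burn_in)
```

Calling mean_abs_magnetisation(n=100, beta=0.5) runs the chain and returns the time-averaged |M|/N. The whole algorithm is "pick a random site, maybe flip it", repeated many times.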
dynm 8 hours ago

I think there is a "correct" answer to this question! The best book on Monte Carlo methods is this one: https://artowen.su.domains/mc/
It's not published yet, but already a classic. (Might be more intermediate than beginner, though.)
For something a bit more gentle, I also recommend chapter 29 of this book: https://www.inference.org.uk/mackay/itila/book.html
quag 8 hours ago

I recommend starting here: https://youtu.be/nKCT-Cdk0xY
Once you understand and use this approach, you can figure out most other approaches you need to use.
the-mitr 8 hours ago

this might be elementary but is a good introduction
https://archive.org/details/TheMonte-carloMethodlittleMathem...

northlondoner 12 hours ago

The R ecosystem provides amazing support for reproducible research, even for statistical physics.

Qem 12 hours ago

I wonder how close R came to taking over the scientific computing/machine learning space, instead of Python's numpy/scipy ecosystem.
- teruakohatu 10 hours ago
  
  I love and use R, but it never became the dominant ML language, in part because it has three (or more) different object systems and many libraries sort of use their own style.
  This makes it seem a bit disjointed, in a way that other languages don’t.
  The R community should have anointed one object system and made tidyverse a core part of R.
  All that said, R is fantastic and the depth of libraries is extensive. Libs are often written by the original researchers that develop the method. At some academic institutions an R package is counted as a paper.
  - paddleon 7 hours ago
    
    > The R community should have anointed one object system
    > and made tidyverse a core part of R.
    Not a tidyverse fan. It doesn't scale well.
    Learn data.table, which has a much more R-like interface and is fast fast fast even for large data sizes. More powerful and more expressive than pandas, and again, faster
    See https://cran.r-project.org/web/packages/data.table/vignettes...
    
    mscbuck 2 hours ago
    
    And if you still prefer the language of tidyverse, use tidytable and you get the best of both worlds!
  - tylermw 9 hours ago
    
    The developing S7 object system (https://github.com/RConsortium/S7) is looking fairly promising in that it combines many of the nice properties of S3 and S4 (validation, multiple dispatch, sane constructors) while still being fairly simple and straightforward to use.
    
    northlondoner 2 hours ago
    
    Excellent news. Quite promising, but R's power is that it is natively functional. Even binary operators are functions: `+`(x,y) works the same as x+y.
  - mvieira38 9 hours ago
    
    Agree 100% on tidyverse becoming part of the standard library. Some of the language's greatest libraries (like Hyndman's forecasting stuff) basically assume you're using tidyverse already
  - clircle 3 hours ago
    
    I have a feeling that most data scientists using R have no need to touch any of the object systems, hard to believe that would be a deal breaker.
- cactusfrog 9 hours ago
  
  The issue with R is that there is too much dsl. This is great for one-off analysis but makes building a cohesive large code base really difficult.
  - UpsideDownRide 8 hours ago
    
    Yeah that's def part of it. As fun as it is there is just too much of it and people jump for it too readily, tidyverse included.
- mhog_hn 10 hours ago
  
  One general purpose web framework away
  - rjdj377dhabsn 10 hours ago
    
    I disagree. R is just not a very nice language.
    It has some really great statistical and data science packages that were well ahead of the competition 10-15 years ago. The web frameworks were good enough for dashboards and what most people were using R for.
    But if you wanted to write fast and elegant non-vectorized code, R is really lacking. I left it for Julia for that reason.
    
    mvieira38 9 hours ago
    
    How is Julia in terms of data science dev experience? Nothing ever felt as good as the R+tidyverse combo to me, at least in Python.
    
    rjdj377dhabsn 9 hours ago
    
    Julia is pretty good at basic data science. Working with dataframes is comparable to R's data.tables with the benefit that I don't need to switch languages if I want to run a fast loop over some data as part of a calculation or use a custom data structure.
    I'm not a fan of pandas, so I'd say Julia and R beat python at basic dataframe manipulation. Nothing beats kdb+/q at dataframes though imo.
    
    mvieira38 2 hours ago
    
    Have you tried Polars in Python? When you get going it's pretty similar to tidyverse, except you're chaining methods instead of piping, and it's lazily evaluated + parallel because of the underlying Rust engine. IME it's tidyverse > polars > pandas > data.table in terms of ergonomics
    
    mhogers 8 hours ago
    
    I agree somewhat with you - nonetheless a FastAPI + Alembic + SQLAlchemy alternative in R would make it possible to use it as a general purpose language
- shiandow 10 hours ago
  
  In statistical physics they still use C a lot, as far as I know.
  - northlondoner 2 hours ago
    
    Good observation. Indeed, the core of isingLenzMC is written in C. R provides great C interfacing facilities.
- mamami 10 hours ago
  
  It was never close. Its syntax is unintuitive and painful to learn as a science undergrad. If it hadn't been Python it would have been another language.
  - physicsguy 9 hours ago
    
    Python's rapid adoption really came out of NumPy, SciPy, Matplotlib copying the interfaces from MATLAB, which was very widely used before but obviously had a cost associated.
  - UpsideDownRide 8 hours ago
    
    This is obviously a personal thing but tidyverse syntax is great and lends itself very well to clear and concise data operations.
- larrydag 10 hours ago
  
  Very close. In fact you could say that it is still competing with Python for users. There is still an active community of developers.
- 3abiton 10 hours ago
  
  R is really not for production deployment. It lacks a lot of what made python popular, and its target users were radically different.
  - shoo 9 hours ago
    
    R was developed for and by statisticians, for better and worse. I used R a little bit 15-20 years ago; what I remember was that quite a few libraries and function interfaces seemed to be designed to be convenient for interactive use, but if you tried to use them in an automated script, e.g. some analysis you wanted to scale up and repeat 10,000 times while bootstrap sampling or hyperparameter sweeping or what have you, those same library and interface design choices involved bizarre edge cases where functions would sometimes do something completely different (perhaps changing the return type) when invoked with slightly different arguments. All these automation-hostile edge cases were annoying to discover and then work around.
    None of this was forced by R the language, it was purely a library design thing by the folks writing the libraries. Whereas in contrast, you simply wouldn't and didn't get such library design in mainstream general purpose programming languages (e.g. in C++, java some of this stuff wouldn't even type check) and similarly in python, even though python being dynamic was fertile ground for people to develop completely bonkers and unautomatable numeric and scientific libraries, the customs for how libraries should work were different
    This is maybe just a reflection that R and R's libraries were being designed for interactive use by humans doing exploratory data analysis, model fitting etc, unlike other programming languages which are used to automate things or build software products that can be shipped.
  - UpsideDownRide 8 hours ago
    
    It's general purpose and really there is no issue with using it in production, outside of the mindset and the lispy nature of it. Source: I was working on R in production for the financial sector.
  - mscbuck 4 hours ago
    
    This is really a non issue now. R's problem back in the day was that it was really specialized in analysis and interactivity, but a lot of the general purpose stuff that made Python popular is now easily achievable in R and well-developed and maintained. RestRServe and Plumber are both excellent tools for REST APIs.
  - dkga 9 hours ago
    
    Completely disagree. I work at a central bank, helping people make some of the most important economic decisions in my country and plenty of analyses are done purely with R.
    
    esafak 9 hours ago
    
    Were they run in production as nightly jobs or something?
    
    melenaboija 9 hours ago
    
    It is used in finance and banking to build statistical models for research not for deployments in production in the technical sense, I hope.

evanb 9 hours ago

If you're interested in HMC, we showed how to apply it to the Ising model in https://arxiv.org/abs/1912.03278 with code available in https://github.com/HISKP-LQCD/ising_hmc

nakamoto_damacy 9 hours ago

Does the use of "Statistical Physics" as opposed to "Statistical Mechanics" indicate a European author or a broader scope?

whyever 7 hours ago

They are synonyms.
- chermi 7 hours ago
  
  No, they are not. Statistical mechanics is a theory, statistical physics is a field.
- nakamoto_damacy 7 hours ago
  
  Physics and Mechanics are not synonyms. The latter is a small subset of the former.
  - whyever 6 hours ago
    
    Yes, but this relation does not apply to statistical mechanics and statistical physics; they mean the same thing: https://en.wikipedia.org/wiki/Statistical_mechanics
    What is included in "statistical physics" that is not included in "statistical mechanics"?
    
    chermi 4 hours ago
    
    Kinetic theory stuff for one, like deposition, growth, sandpile type things. Complex networks and lots of dynamics stuff fall under the statistical physics umbrella but not statistical mechanics. Stat mech's amazingly wide applicability makes it easy to think it's THE way of approaching things statistically, but it's not. The broad encompassing approach has a name: statistical physics.
    
    northlondoner 5 hours ago
    
    There is a distinction. Usually statistical mechanics means the ensemble theory and partition functions that connect microscopic systems to macroscopic ones from a materials point of view. However, statistical physics is a bit more generic; for example, complex networks may not use ensemble theory or partition functions and could use only statistics on the network, such as average neighbourhood or similar.
    
    kgwgk 4 hours ago
    
    People have also used “statistical physics” to refer to the former concept since forever. For example Landau.
    “Statistical mechanics” is also used in a broad sense, just like “quantum mechanics” is often used for anything “quantum”.
    
    nakamoto_damacy 3 hours ago
    
    What I'm getting from this discussion is that we use Statistical Physics to refer to anything covered by Statistical Physics AND Statistical Mechanics, while we use Statistical Mechanics in a narrower context, but it is also possible that some use SM loosely.
    
    kgwgk 23 minutes ago
    
    > it is also possible that some use SM loosely
    I think it’s frequent. For example: https://teach-me-codes.github.io/computational-physics/the_p...

frustratedSpin 8 hours ago

Why is this worth posting? Simulating a 1D Ising model is a homework exercise for undergrads.

northlondoner 7 hours ago

See above comment regarding ergodicity.
emil-lp 8 hours ago

Lucky 10,000.