Stack Traces Are Underrated (karl.berlin)
96 points by zoidb on March 10, 2025 | 56 comments


Stack traces are your #1 ally when supporting someone else's legacy production pile.

Once you get comfortable with how they work and what information they contain, you can hit the ground running anywhere. Stack traces will teach you about the product architecture faster than anyone on the team can.

As you embrace them, you take the little bit of extra time to make sure they go well. For example, re-throwing exceptions correctly, properly awaiting results, etc. Very minor details that make all the difference.
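To make the "re-throwing exceptions correctly" part concrete, here is a minimal Python sketch. The names `ConfigError`, `parse_port`, `load_port` and `format_failure` are made up for illustration; the point is only that `raise ... from exc` keeps the original exception attached, so the printed trace covers both layers.

```python
import traceback


class ConfigError(Exception):
    """Hypothetical application-level error used for illustration."""


def parse_port(raw: str) -> int:
    return int(raw)  # raises ValueError on bad input


def load_port(raw: str) -> int:
    try:
        return parse_port(raw)
    except ValueError as exc:
        # "raise ... from exc" stores the original exception as __cause__,
        # so the low-level frames are not lost when the error is wrapped.
        raise ConfigError(f"invalid port setting: {raw!r}") from exc


def format_failure(raw: str) -> str:
    """Return the full chained trace as text, or "" if nothing failed."""
    try:
        load_port(raw)
    except ConfigError as exc:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return ""
```

Calling `format_failure("80a")` gives a trace that contains the `ValueError` raised in `parse_port` followed by the `ConfigError` raised in `load_port`.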

A broader outcome of this enlightenment is preference for monolithic products. Stack traces fare poorly across web service and API boundaries. If you've only ever worked with microservice architectures, the notion of a stack trace may seem distracting.


> A broader outcome of this enlightenment is preference for monolithic products. Stack traces fare poorly across web service and API boundaries. If you've only ever worked with microservice architectures, the notion of a stack trace may seem distracting.

Yes. People forget that the original concept of microservices, the AWS "everything must have an API", was to put in an accountability boundary across teams. Either the API behaves per its contract or it does not; you're neither expected nor really allowed to cross that boundary into the API to find out why it's doing that.

In an environment which is correctly doing "each microservice is a different small team", that helps. In an environment which is doing "one team maintains lots of microservices", this is nearly always an anti-pattern.


Accountability and transparency go hand in hand, though. Teams would be able to debug together much more easily if stack traces were propagated across RPCs, and good RPC frameworks can do that. Unfortunately, when (ab)using HTTP+JSON for RPCs, good cross-service debugging is one of the first casualties.


I have a hard time imagining what that means but it sounds like it could be helpful for a project at work. Can you point to a source that elaborates on error propagation on rpcs?


It means that if thread A makes an RPC to a remote service, and thread B in that service, which is handling the RPC, throws an exception, the details are serialized and returned to thread A, where the exception is rebuilt and rethrown. On a platform that supports chained exceptions and efficient blocking threads (e.g. the JVM), you then get exceptions and stack traces that cross services.

It's not hard to implement, it's just that the industry kinda gave up on using powerful RPC frameworks with language integration. gRPC is about the fanciest you'll get and that is a clone of Stubby, which was originally written for C++ in a codebase that banned the use of C++ exceptions entirely. So it doesn't try to solve that problem.


> the industry kinda gave up on using powerful RPC frameworks with language integration

The trouble is that ties both ends to the same implementation language. Web-style RPC became popular because its simplicity meant it wasn't tied to any particular features of any particular language.


Yes, and simple textual protocols that the browser's limitations made ubiquitous.

I hope that at some point someone standardizes a way to pass virtual stack IDs, exception data and other RPC metadata via HTTP headers so we can get back the sort of features you find in high end RPC stacks.


Do you know of any methods not tied to a particular RPC framework that attempts to solve this problem? I imagine that you could try to implement something of the sort using OpenTelemetry Tracing at least, although the need to (presumably) reconstruct the distributed stack trace from a separate store is a deficiency of this approach for sure.


Not off hand, there's some JSON problems spec I found once but I don't recall if it had anything for stack traces specifically.

It's not hard anyway. You can just upgrade server frameworks to render the stack trace to either a string or a json data structure (or protobuf), and when an exception bubbles up you capture it, convert it to that structure, compress it, possibly encrypt it with a shared key (just to avoid leaks in case there isn't a proxy stripping the special header), base64 encode it and stick it into a special header. Then your HTTP client can be taught to look for that header in case of a 500, decode it, turn it back into an exception and rethrow it.
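A rough Python sketch of that round trip, standard library only. Everything named here is an assumption for illustration: the header name `X-Debug-Remote-Trace`, the `RemoteError` class and both helper functions are invented, and the optional encryption step is left out entirely.

```python
import base64
import json
import traceback
import zlib

# Hypothetical header name; not part of any standard.
TRACE_HEADER = "X-Debug-Remote-Trace"


class RemoteError(Exception):
    """Client-side stand-in for an exception raised in another service."""

    def __init__(self, type_name: str, message: str, remote_trace: str):
        super().__init__(f"{type_name}: {message}")
        self.type_name = type_name
        self.remote_trace = remote_trace


def encode_exception(exc: BaseException) -> str:
    """Server side: exception -> JSON -> zlib -> base64 header value."""
    payload = {
        "type": type(exc).__name__,
        "message": str(exc),
        "trace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def raise_from_headers(status: int, headers: dict) -> None:
    """Client side: on a 500 carrying the header, rebuild and raise."""
    value = headers.get(TRACE_HEADER)
    if status != 500 or value is None:
        return
    payload = json.loads(zlib.decompress(base64.b64decode(value)))
    raise RemoteError(payload["type"], payload["message"], payload["trace"])
```

The client would call `raise_from_headers(response_status, response_headers)` after each request; the remote trace ends up on the `remote_trace` attribute of the raised `RemoteError`, where a logger or error reporter can print it next to the local one.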

If you have a unified framework like Micronaut it's probably the work of an afternoon to throw together. The hard part is the crypto. The moment you introduce keys, corporate security teams will insist on things like frequent rotation even if it's just a backstop against badly configured frontend servers and not mission critical (if you leak a stack trace, ok, not great but not the end of the world). So it might be better to obfuscate in a different way that isn't meant to be fully secure, or detect if a request comes from an internal IP in a secure way, etc.


> But Rust has a better workaround to create stack traces: the backtrace module, which allows capturing stack traces that you can then add to the errors you return. The main problem with this approach is that you still have to add the stack trace to each error and also trust library authors to do so.

That's technically true, but the situation is not as dire. Many errors do not need stack traces. That so few carry a backtrace in Rust is mostly a result of the functionality still not being stable [1].

I think the bigger issue is that people have largely given up on stack traces, in part because of async programming. There are more and more programming patterns and libraries where backtraces are completely useless. For instance, in JavaScript I keep working with dependencies that come minified or transpiled straight out of npm. In theory node has async stack traces now, but I have yet to see this work through `setTimeout` and friends. It's very common to lose parts of the stack.

Because there are now so many situations where stack traces are unreliable, more and more programmers seem to lose trust in them and no longer see the value they once provided.

I also see it in parts at Sentry where a shocking number of customers are completely willing to work with just minified stack traces and not set up source maps to make them readable.

[1]: https://github.com/rust-lang/rust/issues/99301


Not sure about node (and I don’t recall it ever being a problem), but chrome supports stack traces through setTimeout just fine.

I’m not sure there are many reputable modules on npm that minify without source maps, and if people aren’t using them I’d consider them to be making a poor technical choice, one that I would correct before contributing to the project.

Diffing two lengthy stack traces to find a divergence is perhaps the fastest way to debug a slew of bug types. Let alone just the ability to instantly click into a file/line even from console prints as you follow the execution path.
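As a small illustration of the diffing idea, here is a Python sketch that compares the function names on two failing stacks with `difflib`. All the function names (`request_a`, `cached_path`, and so on) are invented for the example; real traces would come from logs rather than from calling the code twice.

```python
import difflib
import traceback


def frames_of(func) -> list:
    """Run func and return the function names on its failing stack."""
    try:
        func()
    except Exception as exc:
        return [f.name for f in traceback.extract_tb(exc.__traceback__)]
    return []


def fail():
    raise RuntimeError("boom")


def cached_path():
    fail()


def uncached_path():
    fail()


def request_a():
    cached_path()


def request_b():
    uncached_path()


# Lines prefixed with "-" appear only in the first stack, "+" only in the second.
diff = list(
    difflib.unified_diff(frames_of(request_a), frames_of(request_b), lineterm="")
)
```

The shared frames (`frames_of`, `fail`) show up as context, and the divergence between the two paths is the part marked with `-` and `+`.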

And my favorite part is being able to ignore / hide external modules and specific files in chrome’s debugger which allows for stepping through only your code, and evaluating much shorter traces. Something java needed decades ago.

When I do use print debugging I always use console.error to include the expandable stack trace as needed, I can’t imagine how slow it would be to not have that always, and have to resort to stepping and breakpoints to get around.


> I’m not sure there are many reputable modules on npm that minify without source maps, and if people aren’t using them I’d consider them to be making a poor technical choice, one that I would correct before contributing to the project.

React is a good example of a library that is a transpiled mess when installed from npm. Sadly not the only one, there are many more popular libraries that look like this.


Python asyncio supports meaningful stack traces through async functions just fine.

  import asyncio
  
  async def baz():
      await asyncio.sleep(.1)
      raise RuntimeError()
  
  async def bar():
      await asyncio.sleep(.1)
      await baz()
  
  async def foo():
      await asyncio.sleep(.1)
      await bar()
  
  async def main():
      await asyncio.sleep(.1)
      await foo()
  
  if __name__ == "__main__":
      loop = asyncio.new_event_loop()
      asyncio.set_event_loop(loop)
      main_task = loop.create_task(main())
      try:
          loop.run_until_complete(main_task)
      except KeyboardInterrupt:
          main_task.cancel()
          loop.run_until_complete(asyncio.wait([main_task]))
          pass
And then run:

  $ python3 test_stacktrace.py
  Traceback (most recent call last):
    File "/home/user/tmp/test_stacktrace.py", line 24, in <module>
      loop.run_until_complete(main_task)
    File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
      return future.result()
    File "/home/user/tmp/test_stacktrace.py", line 17, in main
      await foo()
    File "/home/user/tmp/test_stacktrace.py", line 13, in foo
      await bar()
    File "/home/user/tmp/test_stacktrace.py", line 9, in bar
      await baz()
    File "/home/user/tmp/test_stacktrace.py", line 5, in baz
      raise RuntimeError()
  RuntimeError


Note how in this stack trace you also lose the information about where the main task was scheduled [1]. While you can stitch together the await points, it's much harder to find where tasks originate. This is also true for `TaskGroup`, where the actual call that schedules a task is lost. You will just find the eventual await, which might be the task group (which is good, since that would be structured concurrency), but often you find nothing, since the task is not properly awaited or is awaited in a completely different place (e.g. pending shutdown).

[1]: the important line is "main_task = loop.create_task(main())"


That's a good point. I would argue that both that line and the reported `loop.run_until_complete(main_task)` line are important, but then it becomes impossible to have a single linear trace when there are manually scheduled and waited on tasks mixed in.

At that point you have something like a "coroutine frame tree" instead of a "stack trace", where you potentially store multiple parent frames and source lines per coroutine frame. Could be presented something like:

  $ python3 test_stacktrace.py
  Traceback (most recent call last):
  * File "/home/user/tmp/test_stacktrace.py", line 22, in <module>
  |   main_task = loop.create_task(main())
  | * File "/home/user/tmp/test_stacktrace.py", line 24, in <module>
  |/    loop.run_until_complete(main_task)
  * File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
  |   return future.result()
  * File "/home/user/tmp/test_stacktrace.py", line 17, in main
  |   await foo()
  * File "/home/user/tmp/test_stacktrace.py", line 13, in foo
  |   await bar()
  * File "/home/user/tmp/test_stacktrace.py", line 9, in bar
  |   await baz()
  * File "/home/user/tmp/test_stacktrace.py", line 5, in baz
      raise RuntimeError()
  RuntimeError
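One way to approximate the "where was this task scheduled" half of such a tree today is to record the stack at scheduling time yourself. This is only a sketch: `create_task_with_origin` and the `task_origin` attribute are names made up here, not asyncio features, and the helper has to be called from inside a running event loop.

```python
import asyncio
import traceback


def create_task_with_origin(coro):
    """Schedule coro, remembering the stack at the point of scheduling."""
    # Drop the last entry (this helper) so the trace ends at the caller.
    origin = "".join(traceback.format_stack()[:-1])

    async def runner():
        try:
            return await coro
        except Exception as exc:
            # Hypothetical attribute name, chosen for this sketch only.
            exc.task_origin = origin
            raise

    return asyncio.create_task(runner())


async def failing():
    await asyncio.sleep(0)
    raise RuntimeError("boom")


async def main():
    task = create_task_with_origin(failing())  # the call site we want to keep
    try:
        await task
    except RuntimeError as exc:
        return exc.task_origin


origin = asyncio.run(main())
```

Here `origin` holds the formatted stack that ends in `main`, at the line that scheduled the task, which is exactly the frame missing from the normal traceback.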


Maybe in your example, but I long ago gave up on having stack traces with any meaningful async Python code. Is it the frameworks' fault? Presumably. But the end result is the same for me. Which is a shame, because Python stack traces are really good, when they work.


I wish there was a mode to force Errors to automatically capture traces & print them as part of the chain on panic. Would save a lot of time when debugging & let you force libraries into supporting it.


> In theory node has async stack traces now, but I have yet to see this work through `setTimeout` and friends. It's very common to lose parts of the stack.

You need to use the actual `await` syntax to get an async stack trace in node. Callbacks and raw promise work can't be seen by the async stack trace implementation which hooks into `await` points.


One nice side effect of how Rust’s Futures work is that in many cases “normal” stack traces actually reflect the async/await flow accurately. You should see a series of “poll” methods called on each future in the async call chain.


Only until you spawn it into an executor :(


A stack trace (or even better, a minidump with the call stack!) is one of the most useful debugging things for me. Hell, the call stack in general is super useful to me!

I can look at a stack trace, go "oh, function X is misbehaving after being called by function Y, from function Z", and work out what's gone wrong from the context clues and other debugger info. As a game developer working with big, semi-monolithic codebases, it's essential, especially when code crosses the gameplay/engine and engine/kernel barriers.


Great point. I've found it's very often possible to understand and fix problems "one shot" from stack traces alone, and we're talking production builds here. So I wouldn't turn them off (an idea mentioned in the article) unless profiling shows that they are one of the last things preventing the code from reaching the target performance.


>Are they just not used to having them so that they don't miss them?

The languages that I work in that don't print useful traces are typically strongishly-typed system languages. So I miss them - sometimes having to step through offending lines of code in a debugger - but I also completely avoid a whole class of bugs that are responsible for most of my stack traces in Python.

TFA's example isn't one of these, but is a function that would have a return code checked and logged if erroneous. This class of bug also can't be inlined and makes an easy breakpoint-ee.


They are useful sure, and I print a stacktrace on any type of error/exception, but often breaking into the debugger is even more useful and faster as you can see local variables, program state, and what other threads happen to be doing.


Hard to break into the debugger for a production application running on hundreds of servers.


One can argue whether stack traces should be enabled for production (at least on all servers) given they're relatively expensive to create. Which isn't a problem if they're exceptional, but in a lot of cases they aren't.


They aren’t equally expensive everywhere. They are cheap on the JVM.


Perhaps, but remote debugging is a thing, though triggering an auto break into debugger would be more complex.


Remote debugging is a thing but not very practical when you have hundreds of instances of your application running across dozens/hundreds of servers.

For smaller scale apps it's a godsend but I haven't worked in that environment in more than a decade so remote debugging is essentially useless in my work.


I am a big fan of Lisp SBCL stack traces; even in complex projects I never saw before, I'm almost always able to read, interpret and fix the issue just from that.


Oh yes. I haven't found any equivalent yet.


Kinda related, but I feel it would be useful for log entries to include file/lineno and/or some unique identifier. Helps both pinpointing where some weird message comes from, and for searching for specific entries in the logs.

Sure, you can grep the log message but it can be difficult if it has some templating/formatting going on, and it can be pretty easy to end up with non-unique messages.
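In Python at least, the standard `logging` module can already do this through the `%(filename)s` and `%(lineno)d` record attributes. A minimal sketch (the logger name `demo.lineno` and the message are arbitrary):

```python
import io
import logging


def make_logger(stream) -> logging.Logger:
    logger = logging.getLogger("demo.lineno")  # arbitrary logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # %(filename)s and %(lineno)d are standard LogRecord attributes that
    # identify the call site of the logging call.
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(filename)s:%(lineno)d %(message)s")
    )
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("cache miss for key=%s", "user-42")
line = buffer.getvalue().strip()
```

Each entry then starts with the level followed by `file:line`, so a templated message can still be traced back to the statement that produced it.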


What's weird is how expensive this can be, i.e. doing it in Go requires invoking runtime reflection, whereas technically the compiler should be able to bake the final line numbers into the messages at build time.


I don't know if it is part of the reason but stack traces can be considered a vulnerability in some situations.

Also, for "normal" errors, you shouldn't need a stack trace. For example, "file not found" is, from the point of view of the developer, an expected situation and should be handled with the same amount of care as if the file was present. You don't dump the internals to the user when you have successfully opened the file, so don't dump them when you haven't.

For unexpected errors (i.e. bugs), there is the crash, abort, panic, or whatever it is called in your language. These will usually give you a stack trace, or a core dump from which you can extract the stack trace and more.

What I would wish for however would be a standard feature in languages to display a stack trace on command. Many languages have it, but even when they do, they could be more prominent. This way, if you encounter an unexpected situation you want to debug without crashing and without a debugger attached, you can call it.
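Python is one of the languages that has this, via `traceback.format_stack()` (or `traceback.print_stack()` to write straight to stderr). A small sketch, with `current_stack`, `handle_request` and `validate` as made-up names:

```python
import traceback


def current_stack() -> str:
    """Return the caller's call stack as text, without raising anything."""
    # [:-1] drops the entry for current_stack itself.
    return "".join(traceback.format_stack()[:-1])


def handle_request():
    return validate()


def validate():
    # Something looks off, but not enough to abort: record how we got here.
    return current_stack()


snapshot = handle_request()
```

`snapshot` lists the frames from the outermost caller down to `validate`, and the program keeps running normally afterwards.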


Stack traces are very valuable. Sometimes it can even help to attach them to some object creation, when you later wonder why/how/where this object was created. E.g. in TensorFlow, every single Tensor had a traceback attached to it, so when there was any error later on, it would show you where it was created. This is maybe less needed now with eager mode, but you might have other similar situations.
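The same trick is easy to apply to your own objects. The sketch below is not how TensorFlow implements it; `Resource` and `build_pipeline` are hypothetical names, and the only real API used is `traceback.extract_stack()` and `traceback.format_list()`.

```python
import traceback


class Resource:
    """Hypothetical object that remembers where it was constructed."""

    def __init__(self, name: str):
        self.name = name
        self.closed = False
        # extract_stack() returns FrameSummary objects; drop __init__ itself.
        self.created_at = traceback.extract_stack()[:-1]

    def use(self):
        if self.closed:
            origin = "".join(traceback.format_list(self.created_at))
            raise RuntimeError(
                f"{self.name} used after close; it was created at:\n{origin}"
            )


def build_pipeline():
    return Resource("reader")


res = build_pipeline()
res.closed = True
try:
    res.use()
except RuntimeError as exc:
    message = str(exc)
```

The error message now points at `build_pipeline`, where the object came from, in addition to the normal traceback that points at the failing `use()` call. Capturing a stack per object has a cost, so this is probably best kept for objects that are created rarely or behind a debug flag.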

One problem with stack traces is maybe that they can be too verbose. E.g. if you print them for any warning you print to log (or stdout). Sometimes they will be extremely helpful for debugging some problem, but in many cases, you maybe don't need them (you know why you get the warning and/or you don't care about it).

You could also add more information to the stack trace such as local variables. That can be even more helpful for debugging then, but again adds more verbosity.
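With only the standard library, a reduced version of this is available through `traceback.TracebackException` and its `capture_locals` option. A minimal sketch (`parse_header`, `format_with_locals` and `demo` are illustrative names):

```python
import traceback


def parse_header(fields):
    expected = 3
    if len(fields) != expected:
        raise ValueError("wrong number of elements")
    return fields


def format_with_locals(exc: BaseException) -> str:
    # capture_locals=True stores the repr() of each frame's local variables
    # and prints them under the corresponding frame.
    te = traceback.TracebackException.from_exception(exc, capture_locals=True)
    return "".join(te.format())


def demo() -> str:
    try:
        parse_header(["a", "b"])
    except ValueError as exc:
        return format_with_locals(exc)
    return ""


report = demo()
```

In `report`, the `parse_header` frame is followed by lines such as `expected = 3` and `fields = ['a', 'b']`, which is often enough to reproduce the failure.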

For example, we often use this to add information about relevant local variables: https://github.com/albertz/py_better_exchook

One solution to the problem with verbosity is when you have foldable text output. Then the stack trace is folded away (not shown in all details) and you can unfold it to see the details. See the DomTerm demo here: https://github.com/albertz/py_better_exchook#domterm

Some more on text folding:

https://github.com/PerBothner/DomTerm/issues/54

https://gitlab.com/gnachman/iterm2/-/issues/4950

https://github.com/xtermjs/xterm.js/issues/1875

https://gitlab.freedesktop.org/terminal-wg/specifications/-/...

https://github.com/vercel/hyper/issues/1093


My main problem with (jvm) stack traces is that they generally don't include information about the values that are passed to the function calls, so you get the structure of the code, but not the actual value that could help you reproduce the error. I know that once you deal with relatively complex objects that are not trivially serializable you get an address, which is not super useful, but in some codebases / problem areas you still could be getting a lot of information that gets lost due to that design decision.


I have been an avid proponent of the way errors are managed in Rust and Go for a long time. However, this article raises a very good point. Before I started developing in Rust and Go, I did Java and Python for several years. And damn, do I miss those stack traces every now and then when something bad happens that isn't properly handled by the code.

Still, I do think returning the error as a return value is better than having a completely separate flow when dealing with exceptions. I like that it forces me to properly deal with an error and not just ignore it and think something like "meh, I'll get to this later". Because I will never "get to it later".


You could combine both by adding a stack frame each time the error is returned one level up. This could be done explicitly (cumbersome and not everyone will do it) or automatically by the language (weird magic, but useful).
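The explicit variant could look roughly like this, sketched in Python for brevity rather than in Rust or Go. `Err` and its `here()` method are invented for the example, and `sys._getframe` is a CPython implementation detail, so treat this as an illustration of the idea only.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class Err:
    """Hypothetical error value that collects one frame per hop."""

    message: str
    trail: list = field(default_factory=list)

    def here(self) -> "Err":
        # Record the function and line of whoever is passing the error on.
        caller = sys._getframe(1)
        self.trail.append(f"{caller.f_code.co_name}:{caller.f_lineno}")
        return self


def read_header(data: bytes):
    if len(data) < 4:
        return None, Err("header too short").here()
    return data[:4], None


def load(data: bytes):
    header, err = read_header(data)
    if err is not None:
        return None, err.here()  # one more frame as the error travels up
    return header, None


_, failure = load(b"ab")
```

After the call, `failure.trail` has one entry for `read_header` and one for `load`, innermost first. The cumbersome part is visible too: every return site has to remember to call `.here()`.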


In C# the quasi mandatory async/await for everything has many downsides, particularly for debugging. It breaks all stack traces. It also makes it impossible to pause the code.


Ha? Can you give an example? I've seen lots of perfectly good stack traces in async code - no problems at all. Pausing code also works, at least using vs or rider.


Hiding stack traces is a bad practice and should be avoided unless you're in the last layer of the application (i.e. presentation to the user).


Stack traces are underrated, unless you're developing EnterpriseJavaSingletonFactoryAbstractionFactoryFactories, in which case they're buffer overflows on your poor log analyzer


Stupid clickbait headline for a famished article


Way before I consistently used step debuggers, back when I would just "print-debug" with println("why are you here?") or "raise-debug" with raise new Error("huh?"), I tinkered with a step debugger but found it too complex and hard. But I remember that it also allowed me to move backwards in the stack.

It allowed me to go some frames back - lines up, up in the stack. I don't recall the name of this debugger, nor what language it was. But I've never since seen this, yet very often wished I had it (for rust, javascript, python, mostly).

Did I misremember? Can such a thing exist? Does it exist?


Time travel debugging is the category, but I can’t help you much more than that with the tool names.


While time-travel debugging would provide such capability, it is highly unlikely they are describing that due to timeframe and unfamiliarity with debuggers.

They are most likely describing the much simpler feature of having the debugger drop stack frames (by force nuking the stack) and then starting over (with all the other side effects that occurred in the “prior” execution still present). This is a fairly common feature for exploratory debugging, but has the obvious downsides of leaving lingering side effects so is only fit for use in non-production environments at best like other “edit-and-continue” features.


Time-travel debugging is something different – time-travel debugging means you can actually step backwards through the execution to try and see how you ended up with the bug.

Merely being able to inspect the state of (local) variables further up the stack frame is a much more limited proposition, even if it can still be useful.

> yet very often wished I had it (for […] javascript […] mostly)

Both Firefox's and Chrome/Edge's devtools allow you to do that, don't they? Click on an entry in the stack frame and it takes you to the corresponding code line and shows you the state of the variables relevant at that point.


VB6 had it, and iirc you could even edit the code after stepping back to step forward on a different path. While trying to confirm that with a quick google I see that visual studio added stepping back in 2017. Though not sure it supports editing inline.


I distinctly remember C# having a similar capability many years ago too.

Of course, this was in the hyper expensive Visual Studio Enterprise IDE, not something like VS Code, which high school me totally paid 4 grand for and didn't pirate at all.

These days, features like these are making it to free or cheap tools.


In Java-world it's very common.


Don't worry. Just wait until you start doing async stuff, especially with React.

You'll be dreaming of good old times of linear stacktraces.


Windows 95 introduced threads as a revolution for developer productivity: you no longer had to write async Windows event loops, which were hard to debug. Linear stack traces were one of the main selling points, among others.


Node 16 was supposed to make this situation much better but you sure could have fooled me. Is there less salt in my wounds? Sure.


Depends on the audience. As a user I'd rather see "can't load data: failed to parse header: wrong number of elements" than a stack trace with WrongNumbersOfElementsException at the tail.
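You can get that kind of message from the same exceptions that carry the stack trace, by walking the cause chain. A minimal Python sketch, assuming each layer wraps errors with `raise ... from`; `user_message` and the three loader functions are made-up names.

```python
def user_message(exc: BaseException) -> str:
    """Join the messages along the __cause__ chain, outermost first."""
    parts = []
    while exc is not None:
        parts.append(str(exc))
        exc = exc.__cause__
    return ": ".join(parts)


def parse_header(fields):
    if len(fields) != 3:
        raise ValueError("wrong number of elements")


def read_header(fields):
    try:
        parse_header(fields)
    except ValueError as exc:
        raise RuntimeError("failed to parse header") from exc


def load_data(fields):
    try:
        read_header(fields)
    except RuntimeError as exc:
        raise RuntimeError("can't load data") from exc


try:
    load_data(["a", "b"])
except RuntimeError as exc:
    message = user_message(exc)
    # message == "can't load data: failed to parse header: wrong number of elements"
```

The user sees the one-line message, while the full traceback of the same exception can still go to the log for developers.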


Stack traces are a feature for developers to locate and fix bugs easily, and should not be a feature for end users.


While you're debugging using AI (specifically, ChatGPT o1), you can benefit from copying stack traces. It debugs better than if you just describe what's wrong.

Another tip: I have found that it is helpful to ask AI to "deeply analyze" (use those words) and think about the problem without providing a solution (say "don't reply with any code"). If you don't do that, it will take its first guess and then eagerly start outputting code that is still wrong and doesn't really identify or fix the issue. When you ask it to deeply analyze what's wrong and not reply with any code, it frequently finds the true underlying problem, and then you can ask for how to solve it in the next step.



