Applies to:
.NET Core 2.1, .NET Core 3.1, .NET 5
This article discusses how to use the createdump tool to capture .NET Core crash dump files in Linux, and then use lldb to diagnose the crash problem.
Prerequisites
The minimum requirement to follow these troubleshooting labs is to have an ASP.NET Core application that demonstrates low-CPU and high-CPU performance problems.
You can find several sample applications on the internet that achieve this goal. For example, you can download and set up Microsoft's simple webapi sample to demonstrate undesirable behavior. Or, you can use the BuggyAmb ASP.NET Core application as the sample project.
If you have followed the previous parts of this series, you should have the following setup ready to go:
- Nginx is configured to host two websites:
  - The first listens for requests that use the myfirstwebsite host header (http://myfirstwebsite) and routes them to the demo ASP.NET Core application that listens on port 5000.
  - The second listens for requests that use the buggyamb host header (http://buggyamb) and routes them to the second, deliberately buggy ASP.NET Core sample application that listens on port 5001.
- Both ASP.NET Core applications run as services that restart automatically when the server is restarted or the application stops responding.
- The Linux local firewall is enabled and configured to allow SSH and HTTP traffic.
If your setup isn't ready, go to "Part 2 Create and run ASP.NET Core apps."
To continue this lab, you must have at least one problematic ASP.NET Core web application that's running behind Nginx.
Goal of this lab
Automatically generated core dump files aren't useful because they don't contain all the managed state information. The recommended tool to capture .NET Core core crash dump files is createdump.
In this part, you'll learn how to capture a crash dump file by using createdump, and how to open the file in lldb to diagnose the crash. The createdump tool is installed automatically with every .NET Core runtime.
As explained in the createdump configuration policy documentation, you can control createdump through environment variables that are passed to the createdump command as parameters. Here are the supported environment variables:
- `COMPlus_DbgEnableMiniDump`: If set to `1`, enables automatic core dump generation when the process terminates. The default is `0`.
- `COMPlus_DbgMiniDumpType`: The type of mini dump file to create. The default value is `2` (the `MiniDumpWithPrivateReadWriteMemory` enum value), which means that the generated dump file includes the GC heaps and the information that's necessary to capture stack traces for all existing threads in the process.
- `COMPlus_DbgMiniDumpName`: If set, used as the template for the dump file path and file name. The PID can be embedded in the name by using `%d`. The default template is `/tmp/coredump.%d`. You can use this environment variable to configure the output directory.
- `COMPlus_CreateDumpDiagnostics`: If set to `1`, enables the createdump tool's diagnostic messages (`TRACE` macro). This setting can be useful if createdump doesn't work as expected and doesn't generate a memory dump file.
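As a quick illustration, here's how you might set these variables in a shell before starting the application. The dump directory below is an assumption for the example; any writable path works:

```bash
# Enable automatic core dump generation with the default (full) dump type,
# writing dumps to /var/dumps with the PID embedded in the file name.
# (/var/dumps is illustrative; the default template is /tmp/coredump.%d.)
export COMPlus_DbgEnableMiniDump=1
export COMPlus_DbgMiniDumpType=2
export COMPlus_DbgMiniDumpName=/var/dumps/coredump.%d
```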
You can find more details about these variables in createdump configuration policy.
The important variable here is `COMPlus_DbgEnableMiniDump`. You have to set this environment variable to `1`. There are several methods to set it:
- Set it in your application's configuration file.
- Run `export COMPlus_DbgEnableMiniDump=1`. This setting won't persist after an operating system restart. Therefore, you have to make it persistent if you want to keep the setting enabled after a restart.
- Set it in the ASP.NET Core service unit file.
Setting this variable in the ASP.NET Core service unit file is the easiest method. The drawback is that the service has to be restarted. This is the option that's demonstrated in this section. Open the buggy application's service file, and add the `COMPlus_DbgEnableMiniDump=1` environment variable, just as you've done several times in previous chapters of this training.
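For reference, here's a minimal sketch of what the relevant part of the unit file might look like. The entries other than `Environment` are illustrative assumptions; keep whatever your existing buggyamb.service file already contains and only add the `Environment` line:

```ini
# /etc/systemd/system/buggyamb.service (abridged; paths are illustrative)
[Service]
WorkingDirectory=/var/buggyamb
ExecStart=/usr/bin/dotnet /var/buggyamb/BuggyAmb.dll
Restart=always
# Enable automatic core dump generation when the process crashes
Environment=COMPlus_DbgEnableMiniDump=1
```

After you save the file, reload the systemd configuration and restart the service:

```bash
sudo systemctl daemon-reload
sudo systemctl restart buggyamb.service
```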
After you make these changes, reproduce the crash problem. If createdump works, the dump file should be written to the /tmp/ directory as coredump.<PID>. Follow the same steps as before to reproduce the problem:
1. Select Crash 3. The page loads correctly but returns a misleading message that suggests that the process should have crashed.
2. Select Slow. This generates an HTTP 502 (Bad Gateway) response instead of the product table.
3. After the problem occurs, none of the pages render, and you receive the same error message for 10-15 seconds.
4. After 10-15 seconds, the application starts working correctly again.
You should now have a core dump file in the /tmp directory.
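You can quickly verify that the dump was written. Assuming the default name template:

```bash
ls -lh /tmp/coredump.*
```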
If you don't have a core dump file, make sure that you configured the buggyamb.service file correctly, and that you reloaded the service configuration and restarted the service.
Open the core dump file in lldb
We recommend that you move the dump file to your ~/dumps/ folder so that you can follow along with the sample analysis. To open the dump file, run `lldb --core ~/dumps/coredump.10354`. In this command, replace 10354 with the PID of your process.
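For example, assuming the crashed process had PID 10354 (yours will differ):

```bash
mkdir -p ~/dumps
mv /tmp/coredump.10354 ~/dumps/
lldb --core ~/dumps/coredump.10354
```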
If you have previously opened a dump file and worked with lldb, you have already set up symbols and installed SOS. You can open the same .NET Core version dump file without having to download the symbols again. However, if you open a different .NET Core version dump file for which the symbols aren't yet downloaded, you'll have to download the symbols for that version before you can start the analysis.
Run the SOS `clrstack` command to display the managed call stack. Remember that you saw an error when you ran the same command against a core dump file that was generated by the system. This time, you should see the correct managed call stack.
This is a good start. However, the call stack that's displayed belongs to the main thread of the debugged process, not the thread where the exception was thrown.
If you open a crash dump file in WinDbg on Windows, WinDbg directly selects the thread that caused the crash. That isn't the case in lldb: lldb doesn't automatically select the thread that triggered the debugger to generate the memory dump.
Although this WinDbg behavior is useful when you're debugging, its absence in lldb isn't the end of the world. Instead, you can examine all the threads to try to determine where the exception could have been thrown. It's always a good idea to start with a quick inspection of all the threads' call stacks so that you understand what was running at the time that the dump file was generated. Look first at the native thread list by using the `thread list` command.
The asterisk (*) next to the first thread in the list (thread #1) indicates that it's the active thread.
Examine that thread's call stack. To do this, you must first select the thread in question. In the memory dump analysis that you'll be running, the thread number will most likely be different. To select another thread as the active thread, use the `thread select` command, and pass the lldb thread ID. For example, run `thread select 15` to switch to thread 15. Every subsequent command that you run will then be in that thread's context. To see the native call stack, run the `bt` (back trace) command.
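Put together, the inspection looks like this (thread 15 is the number from this sample session):

```
(lldb) thread list        # list the native threads; * marks the active thread
(lldb) thread select 15   # switch to the suspect thread (your number will differ)
(lldb) bt                 # show the native back trace for the selected thread
```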
As you can see in this screenshot, this thread is certainly the thread that triggered the crash: `PROCEndProcess` and `PROCAbort()` are called after an unhandled exception, and `PROCCreateCrashDump` tells us that a crash dump is written by .NET Core.
You can examine the managed call stack by running the `clrstack` command. However, this won't reveal much. Run the `pe` (print exception) command to get the exception details.
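For example, on the selected thread:

```
(lldb) clrstack   # managed call stack for the selected thread
(lldb) pe         # print the details of the last exception on this thread
```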
This information indicates that a System.Net.HttpWebRequest is triggered in your Crash3 page, in the `LogTheRequest()` method. This is an important piece of information for locating the problem. But what if you want to find the URL of the HTTP request? To proceed, inspect the objects that are referenced on the stack to see whether you can gather more information. To display all the managed objects that are found within the bounds of the current stack, run `dso` (dump stack objects).
This isn't helpful. You don't see any System.Net.HttpWebRequest instances. There are instances of the exception, but you've already inspected those. Therefore, this command didn't yield any new information that's related to the cause.
All the managed objects are stored in the managed heap, and you can look at the managed heap by running `dumpheap`. Don't run `dumpheap` without any parameters, because the command would then list every object in the managed heap (a very long list). Instead, get the statistics of the heap by using the `dumpheap -stat` command.
You can narrow down the statistics further by filtering on a type name: `dumpheap -stat -type System.Net.HttpWebRequest`.
The following screenshot displays the statistics for the managed objects that contain the string System.Net.HttpWebRequest in their type name.
In the sample application, there's only one System.Net.HttpWebRequest object on the managed heap. In the previous list, the address that appears next to the HttpWebRequest entry isn't that object's address in memory. Rather, it's the address of the method table (MT) for objects of type System.Net.HttpWebRequest. To get the actual list of objects, pass that method table address to the `dumpheap` command, as in `dumpheap -mt <address>`. For example, run `dumpheap -mt 00007f53623cb640` to find the object's address.
Now you can identify the address of the problematic object. In this example, it's 00007f51300c0868. You can investigate the object's properties by passing that address to the `dumpobj` command, which lists the properties of that object. In this example, run `dumpobj 00007f51300c0868`.
You're investigating a System.Net.HttpWebRequest object, and one of its properties is `_requestUri`, an object of the System.Uri type. You want to determine the URI. Therefore, pass the address of the `_requestUri` property to the `dumpobj` command.
Copy the address of the System.Uri object, and investigate it by using `dumpobj` again: run `dumpobj 00007f51300bfbb8`. (The addresses in the memory dump file that you generated will almost certainly be different.) The output displays the `_string` property of System.Uri.
Finally, you've found the URL of the HttpWebRequest: http://buggyamb/Problem/Api/NotExistingLoggingApi. As the name suggests, this is probably not an existing page within the application.
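To recap, here's the whole drill-down from type statistics to the request URL. The addresses are the ones from this sample session; yours will differ:

```
(lldb) dumpheap -stat -type System.Net.HttpWebRequest   # find the method table (MT) address
(lldb) dumpheap -mt 00007f53623cb640                    # list the object instances for that MT
(lldb) dumpobj 00007f51300c0868                         # inspect the HttpWebRequest; note _requestUri
(lldb) dumpobj 00007f51300bfbb8                         # inspect the System.Uri; note _string
```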
To conclude, the theory about how the crash occurred is as follows: an HttpWebRequest is made to a non-existing URL in the `LogTheRequest()` method of the Crash3 webpage.
In a real-world application, the fix would be to handle the errors that occur when the HttpWebRequest is made. In this case, however, the solution is much simpler: don't make an HttpWebRequest to a non-existing page.
At this point, you probably have more questions about what caused the crash. For example, why was the crash triggered after you selected the Slow link?
Feel free to continue the investigation by yourself. A good next step would be to run the `gcroot` command against the HttpWebRequest object address to find out where the object is rooted. This might help you develop a picture of how the crash occurred.
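For example, using the object address from this sample session:

```
(lldb) gcroot 00007f51300c0868   # show which GC roots keep the HttpWebRequest alive
```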
This concludes the lab. Press Ctrl+C or run the `q` command to quit the lldb debugger.
Next steps
Lab 2.1 Troubleshooting performance problems by using createdump in Linux