CMB221 - Windows Internals, Troubleshooting, and Memory Dump Analysis

Learn how current Windows operating systems are designed and implemented, and immediately apply that knowledge to isolate the causes of system and application failures and performance problems. 

Level: Intermediate
Audience: Applications developers; systems software developers; device driver developers; system administrators; system integrators; hardware OEMs; I.T. support personnel
Description:

This is one of our most popular seminars. It combines the troubleshooting and in-depth debugging information from Windows Debugging, Troubleshooting, and Memory Dump Analysis (DBG211) with the necessary internals information from Core Windows Internals (INT201). In short, we’ve merged the “how it all works” material of the latter with the troubleshooting and debugging information and lab problems from the former, creating a single, integrated presentation where each set of material is used to help you learn the other.

We give you a comprehensive “guided tour” of the internal design and implementation of Windows operating systems and, for each key feature, show you how to observe it working, measure it, and optimize it. When it isn’t working, you’ll know how to find out why and how to fix it.

A significant portion of the time will be spent on system memory dump (“blue screen”) analysis using the Windows Debugging Tools (WinDbg). You’ll learn how to correctly isolate problems to the component level (not just pointing at the latest non-Microsoft thing on the stack) and sometimes deeper. We present a number of debugging strategies that are useful for most scenarios. We have you analyze a number of example memory dump files, some from deliberately “bugged” systems, some from crashes experienced in real systems in the field, which together give you practice in applying the strategies and in recognizing which will be effective.

Of course, not every memory dump can be “solved.” But a week spent in this seminar will vastly reduce your time-to-analyze while improving your success rate. You’ll also be able to apply the Windows internals material to many other problems besides crash analysis.

Topics:
  • System architecture overview
  • Introduction to tools
  • Program execution environment
  • Kernel mode components: Executive, kernel, and HAL
  • Environment subsystems and user-to-kernel call implementation
  • Supporting the Windows GUI; understanding "GUI hangs"
  • Handles, objects, and security
  • Kernel mode execution contexts and environment
  • Kernel mode stacks
  • Paged and nonpaged pools
  • Interrupt Request Levels (IRQLs)
  • Scheduling and waiting; multiprocessor/hyperthreading issues
  • Identifying CPU-bound tasks
  • User mode memory management
  • User mode heaps
  • Virtual memory implementation
  • Paging
  • Working set management
  • Physical memory management
  • Virtual and physical memory leaks
  • I/O subsystem and device driver architectures
  • File system cache
  • Types of system failures
  • Analyzing system failures and "hangs"
  • Interpreting stop codes
  • Understanding stack traces and disassembly code
  • Identifying problem components in system crashes
  • Interpreting call sequences
  • "Live" kernel and user mode debugging
Prerequisites: Experience using, administering, or writing applications for Windows operating systems.
Operating systems supported: This seminar primarily addresses Windows 7 through Windows 10 and Windows Server 2012 R2. Most of the material is applicable to earlier versions of Windows. Earlier versions can be specifically addressed upon request.
Durations and formats: 5 days with labs
4 days lecture only
Labs: We strongly recommend the hands-on labs version of this seminar. As in all of our seminars, we have carefully designed a series of demonstrations, lab exercises, and problems that illustrate and build on the information presented. Every key operating system principle is illustrated and demonstrated by one or more “hands-on” exercises. These exercises will also help you gain familiarity with the various tools and utilities.

Specific tools and utilities to be covered include: built-in Windows tools such as Task Manager, Resource Monitor, Performance Monitor, Event Viewer, Registry Editor, Group Policy Editor, and BCDEdit; Sysinternals tools including Process Explorer, Process Monitor, VMMap, RAMMap, and Testlimit; the Windows Debugging Tools (both for examining the live running system and for analyzing memory dumps); and the Windows Performance Toolkit.

For the dump analysis labs, we have a variety of problem scenarios – some involving deliberately created failures; others from real systems with actual bugs in real, shipping drivers – each designed or selected to illustrate the use and applicability of a particular analysis technique and operating system principle. After each lab period, we will lead a walkthrough and discussion of at least one approach to the problem given. After the seminar, we will also provide you with a document that gives a detailed walkthrough of the analysis procedures for each problem scenario, and copies of the corresponding example memory dump files, for your further study.

Specific types of problems to be covered in the dump analysis labs include:
  • Typical “bad pointer”
  • Buffer overrun
  • Page fault in what should have been nonpageable code and data
  • Kernel memory corruption
  • Actual (probable) memory error – a dropped bit
  • "Assertion" bugchecks such as PROCESS_HAS_LOCKED_PAGES and the DPC watchdog timer
  • Kernel stack overflow (rare, but it illustrates several important concepts)
Additional information:

Short formats

There is no short form of this seminar. If you are interested in a "fast-track" approach to these topics, we suggest one of our one-day internals seminars (INT150, INT151, or DRV150), followed by the one-day version of DBG211.