You might want to take a look at the Binary Trees benchmark from the Benchmarks Game. It's a good stress test for garbage collectors, especially when implemented with multiple threads.
The benchmark keeps one long-lived tree reachable for the whole run while it builds, walks, and immediately discards many short-lived trees of varying depth. That means the GC has to reclaim a steady stream of garbage without disturbing the live data.
The code is short, so it's easy to port to different languages. You can also tweak the memory pressure (mainly the maximum tree depth) or the concurrency (the number of worker threads) to test GC behavior under various conditions.
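To give a feel for the shape of the workload, here is a minimal Python sketch rather than the official Benchmarks Game program. The names `make_tree`, `count_nodes`, `churn`, and `run` are my own, and the `workers` knob is an assumption about how you might vary concurrency. Keep in mind that CPython frees acyclic objects mostly through reference counting and that the GIL limits parallel allocation, so for a real GC stress test you'd port this to the runtime you actually care about.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_DEPTH = 4  # smallest tree depth, as in the Benchmarks Game version


def make_tree(depth):
    """Build a perfect binary tree; a node is a (left, right) tuple."""
    if depth == 0:
        return (None, None)  # leaf
    return (make_tree(depth - 1), make_tree(depth - 1))


def count_nodes(tree):
    """Walk the whole tree so every allocated node is actually touched."""
    left, right = tree
    if left is None:
        return 1
    return 1 + count_nodes(left) + count_nodes(right)


def churn(depth, iterations):
    """Allocate and immediately drop `iterations` short-lived trees."""
    return sum(count_nodes(make_tree(depth)) for _ in range(iterations))


def run(max_depth, workers=1):
    """Return (stretch node count, {depth: total nodes}, long-lived node count)."""
    max_depth = max(max_depth, MIN_DEPTH + 2)

    # One oversized tree that is built, counted, and dropped right away.
    stretch = count_nodes(make_tree(max_depth + 1))

    # One tree that stays reachable while all the garbage is produced.
    long_lived = make_tree(max_depth)

    # Shallower trees get more iterations, so each depth allocates
    # roughly the same total amount of memory.
    depths = list(range(MIN_DEPTH, max_depth + 1, 2))
    jobs = [(d, 1 << (max_depth - d + MIN_DEPTH)) for d in depths]

    # Hypothetical concurrency knob: one job per depth, spread over threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        totals = list(pool.map(lambda job: churn(*job), jobs))

    return stretch, dict(zip(depths, totals)), count_nodes(long_lived)


if __name__ == "__main__":
    print(run(10, workers=4))
```

Raising `max_depth` grows both the live heap (the long-lived tree) and the amount of garbage, while `workers` controls how many threads allocate at once, so those two parameters cover most of the tuning you'd want.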