Introduction
Data plays a key role in today's applications.
Testing a data-intensive application with real data is not always feasible, and writing test data by hand is time-consuming.
Hence, we need a way to generate fake data programmatically.
In this article, I will show you how to use Bogus, a .NET library for generating fake data for testing.
Key features and benefits
Data Population: Bogus can populate single objects and whole collections with generated data, which is useful for seeding databases or building large datasets for performance testing.
Realistic Data: Bogus draws values from locale-specific data sets, so generated names, addresses, and similar fields look like real-world data. This makes test scenarios and sample datasets more believable.
Diverse Data Generation: Bogus ships with generators for common categories such as names, addresses, phone numbers, email addresses, dates, and numbers. Its fluent API can also be extended with custom data sets.
Localization Support: Bogus can generate data in multiple languages, which helps when testing internationalized or multilingual applications.
Customization Options: Rules let you tailor the generated data to specific criteria, such as value ranges, formats, and dependencies between properties, so that it matches the application's requirements.
Works with Entity Framework Core: Objects generated by Bogus are ordinary instances of your entity classes, so they can be added to a DbContext to populate a database with realistic test data.
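To make the population and customization points concrete, here is a minimal sketch. The Product type and the ProductSamples helper are hypothetical names invented for this illustration. The sketch constrains values to ranges and fixes a seed so that repeated runs should yield the same data on the same Bogus version.

```csharp
using System;
using System.Collections.Generic;
using Bogus;

// Hypothetical type, used only for this illustration.
public record Product
{
    public string Name { get; set; } = default!;
    public decimal Price { get; set; }
    public int Stock { get; set; }
}

public static class ProductSamples
{
    // Builds a seeded faker: the same seed should give the same sequence of products.
    public static Faker<Product> CreateFaker(int seed) =>
        new Faker<Product>()
            .UseSeed(seed)
            .RuleFor(p => p.Name, f => f.Commerce.ProductName())
            .RuleFor(p => p.Price, f => f.Random.Decimal(1m, 100m)) // price limited to 1..100
            .RuleFor(p => p.Stock, f => f.Random.Int(0, 500));      // stock limited to 0..500

    // Generates a whole collection in one call.
    public static List<Product> Generate(int count, int seed = 1234) =>
        CreateFaker(seed).Generate(count);
}
```

Calling ProductSamples.Generate(1000) would produce a list large enough for a simple load test, and a fixed seed keeps failing tests reproducible.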
The Problem
Consider the following example:
public interface IUserRepository
{
    Task<User> GetUserAsync(string username);
}

public record User
{
    public string Username { get; set; } = default!;
    public string Password { get; set; } = default!;
    public UserProfile Profile { get; set; } = default!;
}

public record UserProfile
{
    public string FirstName { get; set; } = default!;
    public string LastName { get; set; } = default!;
    public Address Address { get; set; } = default!;
}

public record Address
{
    public string Street { get; set; } = default!;
    public string City { get; set; } = default!;
    public string ZipCode { get; set; } = default!;
}
We want to test that the GetUserAsync method returns the whole User, including all of its nested objects. Let's write a test for it.
[Fact]
public async Task GetUserAsync_ReturnsCompleteUser()
{
    // _context and repository are assumed to be set up by the test class.
    var user = new User()
    {
        Username = "apd99",
        Password = "FSXll223!2kjh",
        Profile = new()
        {
            FirstName = "Alexandru",
            LastName = "Prodan",
            Address = new()
            {
                Street = "123 Main St",
                City = "Fakeville",
                ZipCode = "54321"
            }
        }
    };
    _context.Users.Add(user);
    _context.SaveChanges();

    var retrievedUser = await repository.GetUserAsync(user.Username);

    // Compare against the user we stored, member by member.
    retrievedUser.Should().BeEquivalentTo(user, options => options.ComparingByMembers<User>());
}
That took a while to think up and write down, didn't it? Now consider doing this every time you need a complete User object.
Imagine scripting this to populate a database with ten distinct users!
Using Bogus Fakers
Thankfully, Bogus makes this simple. We can define Faker classes, which are effectively instructions for creating dummy objects.
First, we must install Bogus.
dotnet add package Bogus
With the package installed, we can define Fakers for our User class and its nested types.
using Bogus;

public class AddressFaker : Faker<Address>
{
    public AddressFaker()
    {
        RuleFor(addr => addr.Street, f => f.Address.StreetName());
        RuleFor(addr => addr.City, f => f.Address.City());
        RuleFor(addr => addr.ZipCode, f => f.Address.ZipCode());
    }
}

public class UserProfileFaker : Faker<UserProfile>
{
    public UserProfileFaker()
    {
        RuleFor(profile => profile.FirstName, f => f.Name.FirstName());
        RuleFor(profile => profile.LastName, f => f.Name.LastName());
        // Nested objects are produced by their own Faker.
        RuleFor(profile => profile.Address, f => new AddressFaker().Generate());
    }
}

public class UserFaker : Faker<User>
{
    public UserFaker()
    {
        RuleFor(user => user.Password, f => f.Internet.Password());
        RuleFor(user => user.Profile, f => new UserProfileFaker().Generate());
        // Declared after Profile, so the generated names are available here.
        RuleFor(user => user.Username, (f, user) => f.Internet.UserName(user.Profile.FirstName, user.Profile.LastName));
    }
}
Bogus uses a fluent API to specify Fakers.
It ships with fake value generators for many areas, such as names, addresses, internet data, and dates, and community packages add more.
Now, let us modify our test to use the Faker we have just defined.
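As a rough tour of the built-in generators, the sketch below uses the non-generic Faker class directly, without defining any rules. The DataSetTour class and its Sample method are hypothetical names for this illustration. The values are random, so no output is shown.

```csharp
using System;
using Bogus;

public static class DataSetTour
{
    // Returns one random value from several built-in data sets.
    public static string[] Sample()
    {
        var faker = new Faker(); // uses the default "en" locale

        return new[]
        {
            faker.Name.FullName(),
            faker.Internet.Email(),
            faker.Phone.PhoneNumber(),
            faker.Company.CompanyName(),
            faker.Date.Past(1).ToString("yyyy-MM-dd"),   // a date within the past year
            faker.Random.Int(1, 100).ToString(),          // an integer between 1 and 100
            faker.PickRandom("red", "green", "blue")      // one of the supplied values
        };
    }
}
```

Printing the result, for example with Console.WriteLine(string.Join(Environment.NewLine, DataSetTour.Sample())), is a quick way to see what each data set produces.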
[Fact]
public async Task GetUserAsync_ReturnsCompleteUser()
{
    var user = new UserFaker().Generate();
    _context.Users.Add(user);
    _context.SaveChanges();

    var retrievedUser = await repository.GetUserAsync(user.Username);

    // Compare against the generated user, member by member.
    retrievedUser.Should().BeEquivalentTo(user, options => options.ComparingByMembers<User>());
}
It's a considerable improvement, isn't it?
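The earlier scenario of populating a database with ten distinct users also becomes a one-liner, because Generate accepts a count. The sketch below is self-contained, so it repeats the record types from above and, as an alternative to subclassing, declares the rules inline on Faker<T> instances. The UserSeed class is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using Bogus;

public record Address
{
    public string Street { get; set; } = default!;
    public string City { get; set; } = default!;
    public string ZipCode { get; set; } = default!;
}

public record UserProfile
{
    public string FirstName { get; set; } = default!;
    public string LastName { get; set; } = default!;
    public Address Address { get; set; } = default!;
}

public record User
{
    public string Username { get; set; } = default!;
    public string Password { get; set; } = default!;
    public UserProfile Profile { get; set; } = default!;
}

public static class UserSeed
{
    // One faker per type, created once and reused for every generated object.
    private static readonly Faker<Address> Addresses = new Faker<Address>()
        .RuleFor(a => a.Street, f => f.Address.StreetName())
        .RuleFor(a => a.City, f => f.Address.City())
        .RuleFor(a => a.ZipCode, f => f.Address.ZipCode());

    private static readonly Faker<UserProfile> Profiles = new Faker<UserProfile>()
        .RuleFor(p => p.FirstName, f => f.Name.FirstName())
        .RuleFor(p => p.LastName, f => f.Name.LastName())
        .RuleFor(p => p.Address, _ => Addresses.Generate());

    private static readonly Faker<User> Users = new Faker<User>()
        .RuleFor(u => u.Password, f => f.Internet.Password())
        .RuleFor(u => u.Profile, _ => Profiles.Generate())
        .RuleFor(u => u.Username, (f, u) => f.Internet.UserName(u.Profile.FirstName, u.Profile.LastName));

    // Generates the requested number of fully populated users.
    public static List<User> Create(int count) => Users.Generate(count);
}
```

In a test setup like the one above, something along the lines of _context.Users.AddRange(UserSeed.Create(10)) followed by _context.SaveChanges() would then seed the database.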
Locale support
With Bogus, we can generate data in more than 45 languages.
For instance, here’s an example of Bogus creating a random address in Romanian:
var address = new Bogus.DataSets.Address(locale: "ro");
Console.WriteLine(address.FullAddress());
// Sample output (values are random): Aleea Dobrun, Bloc 27, Ap. 408, Brăila, Timor-Leste
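The locale can also be passed to a Faker<T>, so that every rule draws from the same localized data sets. The following is a minimal sketch; the Contact type and the LocalizedContacts helper are hypothetical and exist only for this example.

```csharp
using System;
using Bogus;

// Hypothetical type, used only for this illustration.
public record Contact
{
    public string FullName { get; set; } = default!;
    public string City { get; set; } = default!;
    public string Phone { get; set; } = default!;
}

public static class LocalizedContacts
{
    // Creates a contact whose values come from the given locale, e.g. "ro", "de" or "fr".
    public static Contact Create(string locale) =>
        new Faker<Contact>(locale)
            .RuleFor(c => c.FullName, f => f.Name.FullName())
            .RuleFor(c => c.City, f => f.Address.City())
            .RuleFor(c => c.Phone, f => f.Phone.PhoneNumber())
            .Generate();
}
```

For example, LocalizedContacts.Create("de") should return a contact with a German-looking name, city, and phone number.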
Conclusion
Bogus is a valuable tool for generating synthetic test data efficiently.
With realistic data sets, locale support, and generated objects that work directly with technologies like Entity Framework Core, it simplifies the testing of data-intensive applications.
Its customization options and broad set of generators let developers build precise test scenarios and populate databases with believable test data.
With Bogus, setting up a complex test scenario takes a few rules instead of many lines of hand-written objects.