Proposal: Overhaul CLI Testing

Some optional reading:
* [Jest: Snapshot Testing](https://jestjs.io/docs/snapshot-testing)
* [Snapshot testing in practice: Benefits and Drawbacks](https://www.sciencedirect.com/science/article/abs/pii/S0164121223001929) (Lit review on using snapshots)
* [Nightwatch: Visual Regression Testing](https://nightwatchjs.org/guide/writing-tests/visual-regression-testing.html) (Not shilling Nightwatch or anything, they just had the most succinct guide and example on what visual regression testing is)

---
**TL;DR**
I believe we should introduce less-intensive checks in the CLI functional test folder, and possibly introduce component-level snapshot testing and visual regression testing. The goal is to make the test suites more relevant and maintainable.

**THE ISSUE**
Currently, we have an open issue to explore migrating the CLI functional tests to Jest (#771). Since this issue is in our scope, I believe it's a good time to bring up possible improvements we can make to the testing suite.

### What are we currently doing and why?
The CLI functional tests build an example site and then compare the output with a pre-defined directory containing the expected files. They ensure that the built output is 1:1 with the expected output.

The functional tests in the CLI package are a good way to run E2E tests on MarkBind as a whole, since they touch basically all the components. Tests like these are relatively easy to add to, and they prevent unintended changes in DOM structure from going through.

However, fragility is a problem. One change somewhere can propagate and make the expected files out of date. Diffs generated can be huge for a generally small change. Intuitively, developers are more likely to accurately review smaller diffs (don't quote me).

On top of that, much of our implementation is in-house, meaning more code to maintain, AND, we don't get coverage since Jest isn't currently running the tests.

### How does Snapshot Testing compare with what we're doing?
Snapshot Testing is a similar concept - it compares output with a given "golden master" (i.e. an expected file) that is updated when necessary. However, in practice snapshot testing is intended to be done at the component level, not on whole-site output.
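
To make the "golden master" idea concrete, here is a minimal sketch of the compare-or-update cycle. It is not Jest's implementation; `checkSnapshot`, its `update` flag, and the file names are hypothetical names made up for illustration.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

type SnapshotResult = "written" | "match" | "mismatch";

// Hypothetical helper: compares `actual` with the golden master stored at
// `snapshotPath`. The file is (re)written when it is missing or when
// `update` is true, which mirrors "updated when necessary".
function checkSnapshot(snapshotPath: string, actual: string, update = false): SnapshotResult {
  if (update || !fs.existsSync(snapshotPath)) {
    fs.mkdirSync(path.dirname(snapshotPath), { recursive: true });
    fs.writeFileSync(snapshotPath, actual, "utf8");
    return "written";
  }
  const expected = fs.readFileSync(snapshotPath, "utf8");
  return expected === actual ? "match" : "mismatch";
}

// Usage against a throwaway directory (paths are illustrative only).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "snapshot-demo-"));
const demoFile = path.join(demoDir, "card.snap");
console.log(checkSnapshot(demoFile, "<div class=\"card\">Hi</div>")); // written
console.log(checkSnapshot(demoFile, "<div class=\"card\">Hi</div>")); // match
console.log(checkSnapshot(demoFile, "<div class=\"card\">Bye</div>")); // mismatch
```

The smaller the string passed in, the easier a "mismatch" is to review, which is the argument for snapshotting single components rather than a whole site.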

The reason is simple - this keeps snapshots small and **focused**. Making a change in a component should only require updating its related snapshot, not an unrelated snapshot somewhere else with no context.

It's not ideal for us to simply switch to using Jest with snapshots as-is: Jest generally writes snapshots to one file per test suite, meaning that all related changes would go into that one file and that file would be huge. We should [treat snapshots as code](https://jestjs.io/docs/snapshot-testing#1-treat-snapshots-as-code) - you probably wouldn't edit a file that is 40k lines long.

### Proposal: Remove intensive checking in CLI functional tests and instead do Snapshot Testing at the component-level
I believe we should remove checking between expected/output file content in the functional test suite. Large snapshot files are a key weakness of Snapshot Testing, and that is essentially what we're doing. This incentivizes developers to simply run the update script to overwrite the snapshots. TEAMMATES [encountered this problem](https://github.com/MarkBind/markbind/issues/761#issuecomment-472759620).

Instead, we should have component-level snapshot tests (e.g. a snapshot for card stacks, includes, etc.) if we still want to do DOM checks. These tests additionally should not be in the CLI package.

### Proposal: Reduce granularity of checks in functional tests
The functional testing being done is still required and very much needed, but I propose that instead of verifying the actual contents of the site, we verify the built *structure*. That is, we maintain snapshots of the directory tree of the built site to ensure that files are actually being output.

This allows us to still test that the CLI's commands run without failure, while keeping snapshots small. We would only need to update the snapshot if extra pages are added, old pages are deleted, or MarkBind changes how it structures its output.
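
As a rough sketch of what a directory-tree snapshot could capture, the helper below lists every file under a built site as a sorted relative path. `listTree` is a hypothetical name, and the sketch assumes that only file paths (not file contents) matter.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper: recursively lists every file under `root` as a
// sorted relative path with "/" separators, so the result is stable
// across operating systems.
function listTree(root: string, current: string = root): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
    const fullPath = path.join(current, entry.name);
    if (entry.isDirectory()) {
      files.push(...listTree(root, fullPath));
    } else {
      files.push(path.relative(root, fullPath).split(path.sep).join("/"));
    }
  }
  return files.sort();
}
```

In a Jest test, the returned array could then be passed to `expect(...).toMatchSnapshot()`, so the snapshot only changes when files are added, removed, or moved.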

As a comparison, [facebook/docusaurus](https://github.com/facebook/docusaurus)'s end-to-end testing suite is quite minimal. It runs a command to create a Docusaurus site, and the test passes as long as the command exits with a zero exit code. (see [here](https://github.com/facebook/docusaurus/blob/main/.github/workflows/tests-e2e.yml))
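
An exit-code-only check like this can be sketched in a few lines. This is not how Docusaurus implements it (theirs lives in a CI workflow); `commandSucceeds` is a hypothetical helper built on Node's `child_process.spawnSync`.

```typescript
import { spawnSync } from "child_process";

// Hypothetical helper: runs a command and reports whether it exited with
// status 0. A command that fails to start at all also counts as a failure.
function commandSucceeds(command: string, args: string[]): boolean {
  const result = spawnSync(command, args, { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Usage with the current Node binary standing in for a real CLI command.
console.log(commandSucceeds(process.execPath, ["-e", "process.exit(0)"])); // true
console.log(commandSucceeds(process.execPath, ["-e", "process.exit(3)"])); // false
```

For our case the command would presumably be something like `markbind build` run inside a test site, but that wiring is left out here.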

### Proposal: Introduce Visual Regression Testing
The functional tests are also currently a good way to ensure that components are being written into the DOM tree.

To substitute this, I think visual regression testing would be good. This way, we can create snapshots of the rendered HTML, ensuring that components are present and look as expected.

There is the possibility that we run into the same issue of snapshots being fragile, but I believe this can be mitigated by writing atomic test cases such that one change shouldn't affect more than one snapshot. (We could also have one specific all-in-one snapshot.)

There is a well-maintained Jest matcher that facilitates visual regression testing, [jest-image-snapshot](https://github.com/americanexpress/jest-image-snapshot).
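
To illustrate the underlying idea only, here is a minimal sketch of a pixel comparison with a tolerance. It is not jest-image-snapshot's API (in practice its `toMatchImageSnapshot` matcher would do this work for us); `mismatchRatio`, `imagesMatch`, and the threshold parameter are hypothetical, and the images are assumed to be raw RGBA buffers of the same size.

```typescript
// Hypothetical helper: fraction of pixels that differ between two RGBA
// buffers (4 bytes per pixel) of identical size.
function mismatchRatio(baseline: Uint8Array, current: Uint8Array): number {
  if (baseline.length !== current.length || baseline.length % 4 !== 0) {
    throw new Error("Images must be RGBA buffers of identical size");
  }
  const pixelCount = baseline.length / 4;
  if (pixelCount === 0) return 0;
  let differing = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    const samePixel =
      baseline[i] === current[i] &&
      baseline[i + 1] === current[i + 1] &&
      baseline[i + 2] === current[i + 2] &&
      baseline[i + 3] === current[i + 3];
    if (!samePixel) differing += 1;
  }
  return differing / pixelCount;
}

// Hypothetical helper: passes when the share of differing pixels does not
// exceed `threshold` (0 means the images must be identical).
function imagesMatch(baseline: Uint8Array, current: Uint8Array, threshold = 0): boolean {
  return mismatchRatio(baseline, current) <= threshold;
}
```

A small non-zero threshold is one way to absorb rendering noise such as anti-aliasing, at the cost of possibly missing very small regressions.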

### Closing
Overall, I believe if we follow the 3 proposals, we can create tests that are less of a pain to maintain while still providing similar value. Let me know your thoughts.
