Canvas App Monitor

 Canvas App Monitor

Reference

Canvas App Monitor is a new tool to better understand, debug, and improve your apps.  Monitor provides a running log of all the activity within your app allowing you to see how declarative formulas are actually doing their work.

This first version is focused on network activity similar to a network trace in the browser.  You can drill into details of each network call including how long they took, how much data was returned, and if there was an error.  Also provide context for the request by mapping to the formula that made it.

NOTE: At this time, only network traffic for tabular data sources are included in the Monitor, such as Common Data Service, SharePoint, and SQL Server.  There will be no activities shown for non-tabular connectors, custom connectors, and Flow.  For example calls to the the Office 365 Users connector will not show any activity.  We understand that these other sources are very important too and we are actively working on adding them.  This blog will be updated when they are also included.

Basic operation

Let’s take a look.  On the left hand side of Canvas Studio there is a new tool icon that opens up a tray of Advanced tools:

Monitor is listed as an experimental feature.   Although it is experimental, there is no need to enable it in the File menu’s advanced settings like other experimental features since it has no impact on your app unless you turn it on and then only for that Studio session.  You can use Monitor with any app in a test environment or in production.

Selecting Open monitor opens up a new browser tab and connects to the existing Studio session.  It immediately starts recording and displaying all the network activity in your app:

Note that we don’t start recording the log until you open the Monitor, and the Monitor can only be opened from Studio.  So there is no way to obtain a log for a production app, something we may support in the future.

You can drill into the details of an event by selecting a row.  Details provides a high level overview of the event.  To help keep the display manageable, portions of the JSON may be collapsed, as is “data” in this example:

Formula provides context on which formula in your app caused the event.  The name of the control property is displayed both in the table and at the top of this tab.  At this time, we can’t isolate the exact function call within a large formula, but this is one of many enhancements we have planned.  Note that the operation is listed as createRow which is not a formula language function; this is an internal operation name corresponding to the Patch function.  In general, tighter mapping to what is in your formulas is on our backlog.

Request is the actual network request that was made.

And Response is what was returned.  There are two ways to look at the response, first in a tabular format where tables and records are shown in a grid:

But you can also display the response in JSON format using the drop down control on the right of the screen above the data:

Finding issues

In this ecample, I’m using the Northwind Sample Data app that is included with the Northwind sample solution.  This is a Canvas app that pumps sample data into Common Data Service:

An earlier version of this app had poor performance and would randomly generate errors with no explanation or apparent pattern.  Sometimes it failed and sometimes it worked fine.

To debug this app, the first step was to understand the errors.  Here’s one of them as seen in Monitor, with the key error message highlighted at the bottom right.  This and other requests are returning 429 HTTP status codes, which indicates too many requests, with the specific error message “Rate limit is exceeded.”  At present, this error message does not make it into the app.  But in Monitor it is easy to see:

What does this mean?  Like many services, Common Data Service will throttle requests if there are too many requests or they are coming in too rapidly.

So are requests coming in too rapidly?  Let’s look at that Monitor log again.  For each createRow call we are generating a lot of getRows requests from the ProgressCount.Text property, each to a different entity and not to the entity we are creating rows in.  Let’s look at the formula:

This formula is for the count of rows shown in the Records column to the right in the gallery.  It appears that for each record added, this formula is being reevaluated and CountRows is being called on many of the entities, resulting in getRows in the log since CountRows isn’t delegated for CDS.   For each 1 request to add a record, we are potentially making 12 additional requests.

To test this hypothesis, comment out the formula and replace it with a static text string.   Sure enough, the number of requests comes way down, errors go away, and performance is greatly improved.   Not only were these extra requests sporadically causing the errors due to hitting the CDS request throttle but it also explained the performance problem.

The permanent fix for this app was to do the CountRows manually for each entity as records were being created in it.   The animated Monitor log shown earlier is of that improved app and there are no getRows operations after each createRow, instead they are all done after the rows have been added.

Declarative insights

Canvas apps are declarative.  As a maker, you describe “what” you want the app to do without needing to say “how” it is to be done.   In this example, the formula for ProgressCount.Text described succinctly what was to be displayed.  The maker didn’t need to say when the information was to be refreshed and that is a great thing.  The original app was written correctly.

Because the maker doesn’t specify “when” the system must determine when the information is to be refreshed.  By doing analysis across the entire app, the system usually does a great job of optimizing this, far better than most makers would on their own.  But there are cases where the system will be too careful and do more work than is needed, with performance implications and in this case functional errors due to throttling.  The system can be smarter and only update the count for the entity with the added record, and in fact since I started showing this demo internally the team has made it increasingly hard for me to recreate the errors.

The work around is too be less declarative, take some control back from the system, and do the work manually using imperative means, App.OnStart and Screen.OnVisible events, and caching in state variables.  Most performance optimizations in Canvas apps are based on doing this.  Use this sparingly and with care as imperative logic is often harder to understand and maintain.

The key to figuring all this out is a better understanding what is actually happening in your app.  There often isn’t enough information when looking at the app statically or even when investigating runtime errors.  You need to watch the machine in operation and that is why we have added the Monitor.

And we are in good company.  SQL is one of the most popular declarative computer language.  Most SQL implementations offer tools to understand how the declarative query is translated into concrete actions.  For example, here is a SQL query and how it is translated into a query plan.  With this information, the performance implications of doing joins and adding indices can be better understood.

Monitor is our first tool to expose the declarative implementation details so formula writers can make better decisions.  Not everyone will want to get to this level of detail, and that’s just fine, we’ll keep improving our optimizations too for everyone’s benefit.

No comments:

Post a Comment