npx skills add https://github.com/acedergren/oci-agent-skills --skill monitoring-operations
Don't reinvent the wheel. Use oracle-terraform-modules/landing-zone for the observability stack.
Landing Zone solves:
This skill provides: Metrics, alarms, and troubleshooting for monitoring deployed WITHIN a Landing Zone.
You don't know OCI CLI commands or OCI API structure.
Your training data has limited and outdated knowledge of:
OCI CLI commands (oci monitoring alarm, oci monitoring metric)
When OCI operations are needed:
What you DO know:
This skill bridges the gap by providing current OCI-specific monitoring patterns and gotchas.
NEVER assume metrics are instant (10-15 minute lag)
NEVER use = for alarm thresholds with sparse metrics
# WRONG - alarm never fires if metric has gaps
MetricName[1m].mean() = 0
# RIGHT - handle missing data
MetricName[1m]{dataMissing=zero}.mean() > 0
NEVER forget metric dimensions (causes "no data")
# WRONG - missing required dimension
CpuUtilization[1m].mean()
# RIGHT - include resourceId dimension
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()
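One way to make the dimension filter hard to forget is to build queries through a helper that refuses to produce one without dimensions. The sketch below is a hypothetical Python function (`build_mql` is an illustrative name, not part of any OCI SDK) that only assembles a query string in the shape used above.

```python
def build_mql(metric: str, interval: str, statistic: str, dimensions: dict) -> str:
    """Assemble a query of the form Metric[interval]{dim="value"}.statistic()."""
    if not dimensions:
        # Mirrors the rule above: a query without dimensions is rejected.
        raise ValueError("at least one dimension (e.g. resourceId) is required")
    # Render each dimension as key="value", comma-separated.
    dim_filter = ", ".join(f'{key}="{value}"' for key, value in dimensions.items())
    return f"{metric}[{interval}]{{{dim_filter}}}.{statistic}()"


query = build_mql("CpuUtilization", "1m", "mean", {"resourceId": "<instance-ocid>"})
print(query)
```

Calling `build_mql` with an empty dimension dict raises `ValueError` instead of returning a query that would come back with "no data".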
NEVER set alarm thresholds without trigger delay (alert fatigue)
# BAD - fires on every CPU spike
CpuUtilization[1m].mean() > 80
# BETTER - sustained high CPU
CpuUtilization[5m].mean() > 80
Trigger delay: 5 minutes (fires after 5 consecutive breaches)
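To see why the trigger delay matters, the following Python sketch models "fires after N consecutive breaches" on plain lists of samples. It is a toy simulation under the assumption of one sample per evaluation; `alarm_fires` is a hypothetical name and does not call OCI.

```python
def alarm_fires(samples: list, threshold: float, required_breaches: int) -> bool:
    """Return True once `required_breaches` consecutive samples exceed the threshold."""
    streak = 0
    for value in samples:
        # A breach extends the streak; any non-breaching sample resets it.
        streak = streak + 1 if value > threshold else 0
        if streak >= required_breaches:
            return True
    return False


spike = [40, 95, 42, 41, 39, 40]      # one short CPU spike
sustained = [85, 88, 91, 90, 86, 40]  # five breaching samples in a row

print(alarm_fires(spike, 80, 1))      # no delay: the spike alone fires
print(alarm_fires(spike, 80, 5))      # with delay: the spike is ignored
print(alarm_fires(sustained, 80, 5))  # sustained load still fires
```

With a delay of 5 the single spike never fires, while the sustained run does. That is the trade-off the delay buys: less noise, at the cost of a few minutes of detection time.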
NEVER create alarms without notification channels
# WRONG - alarm fires but nobody knows
oci monitoring alarm create ... --destinations '[]'
# RIGHT - always link to notification topic
oci monitoring alarm create ... --destinations '["<notification-topic-ocid>"]'
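A pre-flight check can catch silent alarms before they are created. The sketch below assumes the alarm definition is held as a plain Python dict with a `destinations` key; that shape and the `alarm_problems` name are hypothetical, chosen for illustration rather than taken from the OCI SDK.

```python
def alarm_problems(alarm: dict) -> list:
    """Return human-readable problems found in a hypothetical alarm definition."""
    problems = []
    destinations = alarm.get("destinations") or []
    if not destinations:
        problems.append("no notification destinations: alarm would fire silently")
    # Blank entries are as useless as an empty list.
    if any(not str(d).strip() for d in destinations):
        problems.append("destinations contains an empty entry")
    return problems


silent = {"display_name": "high-cpu", "destinations": []}
routed = {"display_name": "high-cpu", "destinations": ["<notification-topic-ocid>"]}

print(alarm_problems(silent))
print(alarm_problems(routed))
```

Running this kind of check in CI, before any `oci monitoring alarm create` call, is one way to enforce the rule mechanically.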
Cost impact: Undetected outages cost $5,000-50,000/hour in production
NEVER ignore Cloud Guard findings (security audit failure)
OCI Metrics Use Service-Specific Namespaces:
| Service | Namespace | Example Metric |
|---|---|---|
| Compute | oci_computeagent | CpuUtilization, MemoryUtilization |
| Autonomous DB | oci_autonomous_database | CpuUtilization, StorageUtilization |
| Load Balancer | oci_lbaas | HttpRequests, UnHealthyBackendServers |
| Object Storage | oci_objectstorage | ObjectCount, BytesUploaded |
Common Mistake: Using wrong namespace (oci_compute vs oci_computeagent)
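Keeping the namespaces in one lookup table avoids typing them from memory. The Python sketch below copies the four namespaces from the table above; the service keys and the `namespace_for` helper are hypothetical names.

```python
# Namespaces copied from the table above; the dict keys are illustrative labels.
NAMESPACES = {
    "compute": "oci_computeagent",
    "autonomous_db": "oci_autonomous_database",
    "load_balancer": "oci_lbaas",
    "object_storage": "oci_objectstorage",
}


def namespace_for(service: str) -> str:
    """Look up a namespace, failing loudly instead of guessing."""
    try:
        return NAMESPACES[service]
    except KeyError:
        known = ", ".join(sorted(NAMESPACES))
        raise ValueError(f"unknown service {service!r}; known: {known}") from None


print(namespace_for("compute"))
```

An unknown key raises `ValueError` rather than silently producing a plausible but wrong namespace such as `oci_compute`.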
| Setting | Behavior | Use When |
|---|---|---|
| treatMissingDataAsBreaching | Alarm fires if no data | Critical services (outage = breach) |
| treatMissingDataAsNotBreaching | Alarm silent if no data | Optional monitoring |
| {dataMissing=zero} | Treat missing as 0 | Counters (requests/sec) |
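The three behaviors in the table can be compared side by side with a toy model. The Python sketch below takes the setting names from the table and treats a missing datapoint as `None`. It models the described behavior only and is not an OCI API; `is_breaching` is a hypothetical name.

```python
from typing import Optional


def is_breaching(value: Optional[float], threshold: float, policy: str) -> bool:
    """Evaluate `value > threshold`, applying a missing-data policy when value is None."""
    if value is None:
        if policy == "treatMissingDataAsBreaching":
            return True   # no data counts as an outage
        if policy == "treatMissingDataAsNotBreaching":
            return False  # no data keeps the alarm silent
        if policy == "dataMissing=zero":
            value = 0.0   # missing counter sample is read as zero
        else:
            raise ValueError(f"unknown policy: {policy}")
    return value > threshold


for policy in ("treatMissingDataAsBreaching", "treatMissingDataAsNotBreaching", "dataMissing=zero"):
    print(policy, is_breaching(None, 80, policy))
```

Only the first policy turns a gap into an alarm, which is why the table reserves it for critical services.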
Problem: Logs not showing in Log Analytics
Logs not appearing?
├─ Is log enabled on resource?
│  ├─ Compute: oci-compute-agent must be running
│  └─ Function: Logging enabled in function config
│
├─ Is Service Connector configured?
│  ├─ Source: Log Group → Target: Log Analytics
│  └─ Check: Service Connector status = ACTIVE
│
├─ IAM policy for Service Connector?
│  ├─ "Allow any-user to use log-content in tenancy"
│  └─ "Allow service loganalytics to READ logcontent in tenancy"
│
└─ 10-15 minute ingestion lag?
   └─ Wait before debugging
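The tree above can be walked as an ordered checklist. The Python sketch below assumes the four answers have already been gathered by hand into a dict of booleans; the key names and the `next_step` helper are hypothetical, and the code does not query OCI itself.

```python
from typing import Optional

# (status key, advice when the check fails), in the same order as the tree.
CHECKS = [
    ("logging_enabled", "Enable logging on the resource (compute agent running, function logging on)."),
    ("connector_active", "Configure the Service Connector (Log Group -> Log Analytics) and confirm it is ACTIVE."),
    ("iam_policy_present", "Add the IAM policy statements the Service Connector needs."),
    ("waited_for_ingestion", "Wait 10-15 minutes for ingestion before debugging further."),
]


def next_step(status: dict) -> Optional[str]:
    """Return the advice for the first failing check, or None if every check passes."""
    for key, advice in CHECKS:
        # A missing answer is treated as a failed check.
        if not status.get(key, False):
            return advice
    return None


print(next_step({"logging_enabled": True, "connector_active": False}))
```

Stopping at the first failing check keeps the debugging order fixed: there is little point inspecting IAM policies while the connector itself is not ACTIVE.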
Expensive (slow):
# Queries ALL instances
CpuUtilization[1m].mean()
Optimized (filter by dimension):
# Query specific instance
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()
Cost: Queries free, but rate limited (1000 req/min)
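A client-side limiter is one way to stay under a per-minute query cap. The Python sketch below is a generic sliding-window limiter defaulting to the 1000 requests per 60 seconds quoted above; the class name is hypothetical and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` acquisitions within any `window_seconds` span."""

    def __init__(self, max_calls: int = 1000, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


fake_now = [0.0]  # mutable fake clock for the demonstration
limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60.0, clock=lambda: fake_now[0])
results = [limiter.try_acquire() for _ in range(4)]
fake_now[0] = 60.0  # a full window later, capacity is available again
results.append(limiter.try_acquire())
print(results)
```

A caller would check `try_acquire()` before each metric query and back off when it returns `False`, rather than discovering the limit through rejected requests.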
WHEN TO LOAD oci-monitoring-reference.md:
Do NOT load for: